PREFACE


Why study geometry? 

This book aims to introduce the beginning or working physicist to a 
wide range of analytic tools which have their origin in differential geometry and 
which have recently found increasing use in theoretical physics. It is not uncom- 
mon today for a physicist’s mathematical education to ignore all but the sim- 
plest geometrical ideas, despite the fact that young physicists are encouraged to 
develop mental ‘pictures’ and ‘intuition’ appropriate to physical phenomena. 
This curious neglect of ‘pictures’ of one’s mathematical tools may be seen as the 
outcome of a gradual evolution over many centuries. Geometry was certainly 
extremely important to ancient and medieval natural philosophers; it was in 
geometrical terms that Ptolemy, Copernicus, Kepler, and Galileo all expressed 
their thinking. But when Descartes introduced coordinates into Euclidean 
geometry, he showed that the study of geometry could be regarded as an application
of algebra. Since then, the importance of the study of geometry in the
education of scientists has steadily declined, so that at present a university 
undergraduate physicist or applied mathematician is not likely to encounter 
much geometry at all. 

One reason for this suggests itself immediately: the relatively simple geometry 
of the three-dimensional Euclidean world that the nineteenth-century physicist 
believed he lived in can be mastered quickly, while learning the great diversity of 
analytic techniques that must be used to solve the differential equations of 
physics makes very heavy demands on the student’s time. Another reason must 
surely be that these analytic techniques were developed at least partly in 
response to the profound realization by physicists that the laws of nature could 
be expressed as differential equations, and this led most mathematical physicists 
genuinely to neglect geometry until relatively recently. 

However, two developments in this century have markedly altered the balance 
between geometry and analysis in the twentieth-century physicist’s outlook.
The first is the development of the theory of relativity, according to which the 
Euclidean three-space of the nineteenth-century physicist is only an approxi- 
mation to the correct description of the physical world. The second development, 
which is only beginning to have an impact, is the realization by twentieth-century 
mathematicians, led by Cartan, that the relation between geometry and analysis
is a two-way street: on the one hand analysis may be the foundation of the study 
of geometry, but on the other hand the study of geometry leads naturally to the 
development of certain analytic tools (such as the Lie derivative and the exterior 
calculus) and certain concepts (such as the manifold, the fiber bundle, and the 
identification of vectors with derivatives) that have great power in applications 
of analysis. In the modern view, geometry remains subsidiary to analysis. For 
example, the basic concept of differential geometry, the differentiable manifold, 
is defined in terms of real numbers and differentiable functions. But this is no 
disadvantage: it means that concepts from analysis can be expressed geometri- 
cally, and this has considerable heuristic power. 

Because it has developed this intimate connection between geometrical and 
analytic ideas, modern differential geometry has become more and more import- 
ant in theoretical physics, where it has led to a greater simplicity in the math- 
ematics and a more fundamental understanding of the physics. This revolution 
has affected not only special and general relativity, the two theories whose con- 
tent is most obviously geometrical, but other fields where the geometry involved 
is not always that of physical space but rather of a more abstract space of vari- 
ables: electromagnetism, thermodynamics, Hamiltonian theory, fluid dynamics, 
and elementary particle physics. 


Aims of this book 

In this book I want to introduce the reader to some of the more 
important notions of twentieth-century differential geometry, trying always to 
use that geometrical or ‘pictorial’ way of thinking that is usually so helpful in 
developing a physicist’s intuition. The book attempts to teach mathematics, not 
physics. I have tried to include a wide range of applications of this mathematics 
to branches of physics which are familiar to most advanced undergraduates. I 
hope these examples will do more than illustrate the mathematics: the new 
mathematical formulation of familiar ideas will, if I have been successful, give 
the reader a deeper understanding of the physics. 

I will discuss the background I have assumed of the reader in more detail 
below, but here it may be helpful to give a brief list of some of the ‘familiar’ 
ideas which are seen in a new light in this book: vectors, tensors, inner products, 
special relativity, spherical harmonics and the rotation group (and angular- 
momentum operators), conservation laws, volumes, theory of integration, curl 
and cross-product, determinants of matrices, partial differential equations and 
their integrability conditions, Gauss’ and Stokes’ integral theorems of vector 
calculus, thermodynamics of simple systems, Caratheodory’s theorem (and the 
second law of thermodynamics), Hamiltonian systems in phase space, Maxwell’s 
equations, fluid dynamics (including the laws governing the conservation of
circulation), vector calculus in curvilinear coordinate systems, and the quantum 
theory of a charged scalar field. Besides these more or less familiar subjects, 
there are a few others which are not usually taught at undergraduate level but 
which most readers would certainly have heard of: the theory of Lie groups and 
symmetry, open and closed cosmologies, Riemannian geometry, and gauge 
theories of physics. That all of these subjects can be studied by the methods of 
differential geometry is an indication of the importance differential geometry is 
likely to have in theoretical physics in the future. 

I believe it is important for the reader to develop a pictorial way of thinking 
and a feeling for the ‘naturalness’ of certain geometrical tools in certain situ- 
ations. To this end I emphasize repeatedly the idea that tensors are geometrical 
objects, defined independently of any coordinate system. The role played by 
components and coordinate transformations is submerged into a secondary 
position: whenever possible I write equations without indices, to emphasize the 
coordinate-independence of the operations. I have made no attempt to present 
the material in a strictly rigorous or axiomatic way, and I have had to ignore 
many aspects of our subject which a mathematician would regard as funda- 
mental. I do, of course, give proofs of all but a handful of the most important 
results (references for the exceptions are provided), but I have tried wherever 
possible to make the main geometrical ideas in the proof stand out clearly from 
the background of manipulation. I want to show the beauty, elegance, and
naturalness of the mathematics with the minimum of obscuration. 


How to use this book 

The first chapter contains a review of the sort of elementary math- 
ematics assumed of the reader plus a short introduction to some concepts, par- 
ticularly in topology, which undergraduates may not be familiar with. The next 
chapters are the core of the book: they introduce tensors, Lie derivatives, and 
differential forms. Scattered through these chapters are some applications, but 
most of the physical applications are left for systematic treatment in chapter 5. 
The final chapter, on Riemannian geometry, is more advanced and makes con- 
tact with areas of particle physics and general relativity in which differential 
geometry is an everyday tool. 

The material in this book should be suitable for a one-term course, provided 
the lecturer exercises some selection in the most difficult areas. It should also be 
possible to teach the most important points as a unit of, say, ten lectures in an 
advanced course on mathematical methods. I have taught such a unit to graduate 
students, concentrating mainly on §§2.1-2.3, 2.5-2.8, 2.12-2.14, 2.16, 2.17,
2.19-2.28, 3.1-3.13, 4.1-4.6, 4.8, 4.14-4.18, 4.20-4.23, 4.25, 4.26, 5.1, 5.2,
5.4-5.7, and 5.15-5.18. I hope lecturers will experiment with their own choices
of material, especially because there are many people for whom geometrical
reasoning is easier and more natural than purely analytic reasoning, and for them 
an early exposure to geometrical ideas can only be helpful. As a general guide to 
selecting material, section headings within chapters are printed in two different 
styles. Fundamental material is marked by boldface headings, while more 
advanced or supplementary topics are marked by boldface italics. All of the last 
chapter falls into this category. The same convention of type-face distinguishes 
those exercises which are central to the development of the mathematics from 
those which are peripheral. 

The exercises form an integral part of the book. They are inserted in the 
middle of the text, and they are designed to be worked when they are first 
encountered. Usually the text after an exercise will assume that the reader has 
worked and understood the exercise. The reader who does not have the time to 
work an exercise should nevertheless read it and try to understand its result. 
Hints and some solutions will be found at the end of the book. 


Background assumed of the reader 

Most of this book should be understandable to an advanced under- 
graduate or beginning graduate student in theoretical physics or applied math- 
ematics. It presupposes reasonable facility with vector calculus, calculus of many 
variables, matrix algebra (including eigenvectors and determinants), and a little 
operator theory of the sort one learns in elementary quantum mechanics. The 
physical applications are drawn from a variety of fields, and not everyone will 
feel at home with them all. It should be possible to skip many sections on 
physics without undue loss of continuity, but it would probably be unrealistic 
to attempt this book without some familiarity with classical mechanics, special 
relativity, and electromagnetism. The bibliography at the end of chapter 1 lists 
some books which provide suitable background. 


I want to acknowledge my debt to the many people, both colleagues and 
teachers, who have helped me to appreciate the beauty of differential geometry 
and understand its usefulness in physics. I am especially indebted to Kip Thorne, 
Rafael Sorkin, John Friedman, and Frank Estabrook. I also want to thank the 
first two and many patient students at University College, Cardiff, for their com- 
ments on earlier versions of this book. Two of my students, Neil Comins and 
Brian Wade, deserve special mention for their careful and constructive sug- 
gestions. It is also a pleasure to thank Suzanne Ball, Jane Owen, and Margaret 
Wilkinson for their fast and accurate typing of the manuscript through all its 
revisions. Finally, I thank my wife for her patience and encouragement, par- 
ticularly during the last few hectic months. 


Cardiff, 30 June 1979 Bernard Schutz 


1 SOME BASIC MATHEMATICS 


This chapter reviews the elementary mathematics upon which the geometrical 
development of later chapters relies. Most of it should be familiar to most 
readers, but we begin with two topics, topology and mappings, which many 
readers may find unfamiliar. The principal reason for including them is to enable 
us to define precisely what is meant by a manifold, which we do early in chapter 
2. Readers to whom topology is unfamiliar may wish to skip the first two 
sections initially and refer back to them only after chapter 2 has given them suf- 
ficient motivation. 


1.1 The space $R^n$ and its topology

The space $R^n$ is the usual n-dimensional space of vector algebra: a point
in $R^n$ is a sequence of $n$ real numbers $(x_1, x_2, \dots, x_n)$, also called an n-tuple of
real numbers. Intuitively we have the idea that this is a continuous space, that
there are points of $R^n$ arbitrarily close to any given point, that a line joining any
two points can be subdivided into arbitrarily many pieces that also join points of
$R^n$. These notions are in contrast to properties we would ascribe to, say, a lattice,
such as the set of all n-tuples of integers $(i_1, i_2, \dots, i_n)$. The concept of continuity
in $R^n$ is made precise in the study of its topology. The word ‘topology’ has
two distinct meanings in mathematics. The one we are discussing now may be 
called local topology. The other is global topology, which is the study of large- 
scale features of the space, such as those which distinguish the sphere from the 
torus. We shall have something to say about global topology later, particularly 
in the chapter on differential forms. But first we must take a brief look at local 
topology. 

The fundamental concept is that of a neighborhood of a point in $R^n$, which
we can define after introducing a distance function between any two points
$x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ of $R^n$:

$$d(x, y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2 + \dots + (x_n - y_n)^2]^{1/2}. \tag{1.1}$$

A neighborhood of radius $r$ of the point $x$ in $R^n$ is the set of points $N_r(x)$ whose
distance from $x$ is less than $r$. For $R^2$ this is illustrated in figure 1.1. The
continuity of the space can now be more precisely defined by considering very
small neighborhoods. A set of points of $R^n$ is discrete if each point has a neighborhood
which contains no other points of the set. Clearly $R^n$ itself is not discrete.
A set of points $S$ of $R^n$ is said to be open if every point $x$ in $S$ has a neighborhood
entirely contained in $S$. Clearly, discrete sets are not open, and from
now on we will have no use for discrete sets. A simple example of an open set in
$R^1$ (also known simply as $R$) is all points $x$ for which $a < x < b$ for two real
numbers $a$ and $b$. An important thing to understand is that the set of points for
which $a \le x < b$ is not open, because the point $x = a$ does not have a neighborhood
entirely contained in the set: some points of any neighborhood of $x = a$
must be less than $a$ and therefore outside the set. This is illustrated in figure 1.2.
This is, of course, a very general property: any reasonable ‘chunk’ of $R^n$ will be
open if we do not include the boundary of the chunk in the set.


Fig. 1.1. The distance function $d(x, y)$ defines a neighborhood in $R^2$
which is the interior of the disc bounded by the circle of radius $r$. The
circle itself is not part of this neighborhood.





Fig. 1.2. (a) Any neighborhood of the point $x = a$ must include points
to the left of $a$, while (b) any point to the right of $a$ has a neighborhood
entirely to the right of $a$.
The idea that a line joining any two points of $R^n$ can be infinitely subdivided
can be made more precise by saying that any two distinct points of $R^n$ have neighborhoods
which do not intersect. (They will also have some neighborhoods which
do intersect, but if we choose small enough neighborhoods we can make them
disjoint.) This is called the Hausdorff property of $R^n$. It is possible to construct
non-Hausdorff spaces, but for our purposes they are artificial and we shall ignore
them.

Notice that we have used the distance function $d(x, y)$ to define neighborhoods
and thereby open sets. We say that $d(x, y)$ has induced a topology on $R^n$.
By this we mean that it has enabled us to define open sets of $R^n$ which have the
properties:

(Ti) if $O_1$ and $O_2$ are open, so is their intersection, $O_1 \cap O_2$; and

(Tii) the union of any collection (possibly infinite in number) of open sets is open.

In order to make (Ti) apply to all open sets of $R^n$, we define the empty set (or
null set) to be open, and in order to make (Tii) work we likewise define $R^n$ itself
to be open. (In more advanced treatments one defines a topological space to be a
collection of points with a definition of open sets satisfying (Ti) and (Tii). In
this sense the distance function $d(x, y)$ has enabled us to make $R^n$ into a topological
space.)

At this point we must ask whether the induced topology depends very much 
on the precise form of d(x, y). Suppose, for example, that we use a different 
distance function 


$$d'(x, y) = [4(x_1 - y_1)^2 + 0.1(x_2 - y_2)^2 + \dots + (x_n - y_n)^2]^{1/2}. \tag{1.2}$$

This also defines neighborhoods and open sets, as shown in figure 1.3 for $R^2$.


Fig. 1.3. The distance function $d'(x, y) = [4(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}$
defines a neighborhood in $R^2$ which is the interior of the disc bounded
by the ellipse $4(x_1 - y_1)^2 + (x_2 - y_2)^2 = r^2$. As in figure 1.1, the
ellipse itself is not in the neighborhood.
The key point is that any set which is open according to $d'(x, y)$ is also open
according to $d(x, y)$, and vice versa. The proof of this is not hard, and it rests on
the fact that any given $d$-type neighborhood of $x$ contains a $d'$-type neighborhood
entirely within it, and vice versa. That is, given a $d$-type neighborhood of
radius $\epsilon$ about $x$, one can choose a number $\delta$ so small that a $d'$-type neighborhood
of $x$ of radius $\delta$ is entirely within the original (see figure 1.4). So we can
conclude that if a set is open as defined by $d(x, y)$ it is also open as defined by
$d'(x, y)$, and vice versa. We therefore say that both $d$ and $d'$ induce the same
topology on $R^n$. The reader may wish to show that the distance functions


Fig. 1.4. In $R^2$ a $d$-neighborhood of radius $\epsilon$ (bounded by the circle)
entirely contains a $d'$-neighborhood of radius $\delta$ (bounded by the
ellipse defined in figure 1.3) if $\delta < \epsilon$. If $\delta > 2\epsilon$ the inclusion is reversed.













Fig. 1.5. (a) In $R^2$ the distance function $d''$ has circular neighborhoods
smaller, for a given radius $r$, than those of $d$. (b) The neighborhoods of
$d'''$ are bounded by squares of side $2r$.


$$d''(x, y) = \exp[d(x, y)] - 1, \tag{1.3}$$

$$d'''(x, y) = \max(|x_1 - y_1|, |x_2 - y_2|, \dots, |x_n - y_n|) \tag{1.4}$$

also induce the same topology. Their neighborhoods in $R^2$ are illustrated in
figure 1.5. So although we began with the usual Euclidean distance function
$d(x, y)$, the topology we have defined is not very dependent on the form of $d$.
This is called the ‘natural’ topology of $R^n$. Topology is a more ‘primitive’ concept
than distance. We do not need to know the actual distance between points,
since many different distance definitions will do. What we need is only a notion 
that the distance between points can be made arbitrarily small and that no two 
distinct points have zero distance between them. 

Our definition of a neighborhood was tied to a particular distance function, 
but because the topology of a manifold is more general than any particular dis- 
tance function the word ‘neighborhood’ is often used in a different sense. We 
will often find it convenient to let a neighborhood of a point x be any set con- 
taining an open set containing x. It should always be clear from the context 
which sense of ‘neighborhood’ is intended. 
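
The nesting of neighborhoods that underlies this section can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the function names (`d`, `d_prime`, `d_max`, `nested`) and the random-sampling test are introduced here and are not part of the text, and `d_prime` is taken in the two-dimensional form quoted in the caption of figure 1.3. Sampling can refute an inclusion but cannot prove one, so a result of `True` is suggestive rather than conclusive.

```python
import math
import random

def d(x, y):
    """Euclidean distance, equation (1.1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_prime(x, y):
    """Two-dimensional distance of figure 1.3 (assumed form for this sketch)."""
    return math.sqrt(4 * (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)

def d_max(x, y):
    """Distance of equation (1.4): the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def nested(inner, outer, delta, eps, centre=(0.0, 0.0), samples=10000, seed=0):
    """Sample points of R^2 near `centre`. Return False as soon as a point lies in
    the `inner`-neighborhood of radius delta but outside the `outer`-neighborhood
    of radius eps; return True if no such point is found."""
    rng = random.Random(seed)
    for _ in range(samples):
        p = (centre[0] + rng.uniform(-2 * eps, 2 * eps),
             centre[1] + rng.uniform(-2 * eps, 2 * eps))
        if inner(centre, p) < delta and not outer(centre, p) < eps:
            return False
    return True

# A d'-neighborhood of radius 1 appears to sit inside the d-neighborhood of radius 1,
# and a d-neighborhood of radius 0.5 inside the d'-neighborhood of radius 1.
print(nested(d_prime, d, 1.0, 1.0), nested(d, d_prime, 0.5, 1.0))
```

Each distance function yields neighborhoods that fit inside those of the other once the radius is chosen small enough, which is the property the argument for a common topology relies on.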


1.2 Mappings 

The concept of a mapping, simple though it is, will be so useful later
that it is well to spend some time discussing it. A map $f$ from a space $M$ to a
space $N$ is a rule which associates with an element $x$ of $M$ a unique element $y$ of
$N$. It is useful to keep in one’s mind a general picture of a map, such as figure
1.6. The simplest example of a map is an ordinary real-valued function on $R$.
The function $f$ associates a point $x$ in $R$ with a point $f(x)$ also in $R$. (This illustrates
the fact that the spaces $M$ and $N$ need not be distinct.) Such a map is
shown in the usual way in figure 1.7. Notice that the map gives a unique $f(x)$ for
every $x$, but not necessarily a unique $x$ for every $f(x)$. In the figure, both $x_0$ and


Fig. 1.6. A pictorial representation of the mapping $f: M \to N$ showing
$x \mapsto f(x)$.
$x_1$ map into the same value. Such a map is called many-to-one. More generally, if
$f$ maps $M$ to $N$ then for any set $S$ in $M$ the elements in $N$ mapped from points of
$S$ form a set $T$ called the image of $S$ under $f$, denoted by $f(S)$. Conversely, the
set $S$ is called the inverse image of $T$, denoted by $f^{-1}(T)$. If the map is many-to-one
then the inverse image of a single point of $N$ is not a single point of $M$, so
there is no map $f^{-1}$ from $N$ to $M$, since every map must have a unique image. So
in general the symbol $f^{-1}(T)$ must be read as a single symbol: it is not the image
of $T$ under a map $f^{-1}$ but simply a set called $f^{-1}(T)$. On the other hand, if every
point in $f(S)$ has a unique inverse image point in $S$, then $f$ is said to be one-to-one
(abbreviated 1-1) and there does exist another 1-1 map $f^{-1}$, called the
inverse of $f$, which maps the image of $M$ to $M$. These concepts, if not the words
used to describe them, are familiar from elementary calculus. The function
$f(x) = \sin x$ is many-to-one, since $f(x) = f(x + 2n\pi) = f((2n + 1)\pi - x)$ for any
integer $n$. Therefore, a true inverse function does not exist. The usual inverse
function, $\arcsin y$ or $\sin^{-1} y$, is obtained by restricting the original sine function
to the ‘principal’ values, $-\pi/2 \le x \le \pi/2$, on which it is indeed 1-1 and invertible.
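
The many-to-one character of the sine function can be seen in a few lines of Python. This is a minimal sketch: the particular numbers `x = 0.3` and `n = 2` and the name `principal_inverse` are arbitrary choices made here for illustration.

```python
import math

def f(x):
    """The many-to-one map f(x) = sin x."""
    return math.sin(x)

x = 0.3
n = 2

# Three different points of R that f sends to the same value.
same_values = [f(x), f(x + 2 * n * math.pi), f((2 * n + 1) * math.pi - x)]

def principal_inverse(y):
    """Inverse of the sine restricted to the principal values -pi/2 <= x <= pi/2."""
    return math.asin(y)

# On the principal values the inverse recovers x ...
print(principal_inverse(f(x)))
# ... but starting from a point outside them it returns a different point
# with the same image, not the point we started from.
print(principal_inverse(f(x + 2 * n * math.pi)), x + 2 * n * math.pi)
```

The second call shows why $f^{-1}$ of a point must in general be read as a set: the restricted inverse picks out only one member of the inverse image.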

Another example of a 1-1 map is a geographical map of part of the Earth’s 
surface: this maps a point of the Earth’s surface to a point of a piece of paper. 
Yet another map is a rotation of a sphere about some diameter: this maps a 
point of the sphere to another one a fixed angular distance away as measured 
about the axis of rotation. 

We shall now introduce some standard notation and terminology regarding
maps. The statement that $f$ maps $M$ to $N$ is abbreviated $f: M \to N$. The statement
that $f$ maps a particular element $x$ of $M$ to $y$ of $N$ has its own special notation,
$f: x \mapsto y$. If the name of a map is $f$, the image of a point $x$ is $f(x)$. When the map
is a real-valued function of, say, $n$ variables (so $f: R^n \to R$), it is conventional
among physicists to use the symbol $f(x)$ to denote both the value of $f$ on $x$ and
the function itself. When there is no chance of confusion we will follow that
convention. If we have two maps, $f$ and $g$, $f: M \to N$ and $g: N \to P$, then there is a


Fig. 1.7. A many-to-one map (function) of $R$ to $R$.
map called the composition of $f$ and $g$, denoted by $g \circ f$, which maps $M$ to $P$
($g \circ f: M \to P$). This is defined in the obvious way: take a point $x$ of $M$, find the
point $f(x)$ of $N$, and use $g$ to map it to $P$: $(g \circ f)(x) = g(f(x))$. It is conventional
to write the composition $g \circ f$ in such a way that the map acting first is the one
on the right.
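
The ordering convention for composition is easy to get backwards, so a small Python sketch may help. The helper `compose` and the two sample maps are hypothetical, chosen here only to make the order of application visible.

```python
def compose(g, f):
    """Return the composition g o f: the map on the right, f, acts first."""
    return lambda x: g(f(x))

def f(x):
    """A sample map f: M -> N (here simply x -> x + 1)."""
    return x + 1

def g(y):
    """A sample map g: N -> P (here simply y -> 2y)."""
    return 2 * y

g_after_f = compose(g, f)   # x -> 2(x + 1)
f_after_g = compose(f, g)   # x -> 2x + 1

print(g_after_f(3), f_after_g(3))
```

The two results differ, which illustrates that $g \circ f$ and $f \circ g$ are in general different maps even when both are defined.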

If a map is defined for all points of $M$, then we say it is a mapping from $M$
into $N$. If, in addition, every point of $N$ has an inverse image (not necessarily a
unique one), we say it is a mapping from $M$ onto $N$. As mentioned above, if the
inverse image is unique, the map is one-to-one. (A map which is both 1-1 and
onto is called a bijection.) As an example, let $N$ be the unit open disc in $R^2$, the
set of all points $x$ for which $d(x, 0) < 1$ (where 0 is the origin of $R^2$). Let $M$ be
the surface of the hemisphere $\theta < \pi/2$ of the unit sphere (see figure 1.8). Clearly
there is a map $f$ which is 1-1 from $M$ onto $N$.

The terminology of mapping theory, combined with what we have learned of 
topology, enables us to give a useful and compact definition of a continuous 
function, or in fact of any continuous map. A map $f: M \to N$ is continuous at $x$
in $M$ if any open set of $N$ containing $f(x)$ contains the image of an open set of $M$
containing $x$. (This presupposes, of course, that $M$ and $N$ are topological spaces. Otherwise
continuity has no meaning.) More generally, $f$ is continuous on $M$ (or, simply,
continuous) if it is continuous at all points of $M$. Let us see how this is related to
the usual elementary calculus definition of a continuous function. 

Suppose $f$ is a real-valued function of one real variable. That is, $f$ is a map of
$R$ to $R$, taking a number $x$ to a number $f(x)$. (In the notation above, $f: R \to R$.)
Then in the elementary calculus view $f$ is continuous at a point $x_0$ if for every
$\epsilon > 0$ there exists a $\delta > 0$ such that $|f(x) - f(x_0)| < \epsilon$ for all $x$ for which
$|x - x_0| < \delta$ (see figure 1.9). To re-express this in terms of open sets, notice that
for $R$ the distance function $d'''(x, x_0)$ defined in §1.1 above just reduces to


Fig. 1.8. By imagining the disc to be the equatorial section of the ball 
bounded by the sphere, one constructs a simple map by projecting 
perpendicular to the disc. 
$d'''(x, x_0) = |x - x_0|$. Therefore this definition says that $f$ is continuous at $x_0$ if
every $d'''$-neighborhood of $f(x_0)$ contains the image of a $d'''$-neighborhood of
$x_0$. Since these neighborhoods are open sets, the new definition of continuity
given in the previous paragraph contains the elementary-calculus definition as a
special case. Conversely, the elementary-calculus definition implies the other
because any open set in $R$ containing $f(x_0)$ contains a $d'''$-neighborhood of $f(x_0)$,
which in turn contains the image of an open set containing $x_0$ (namely, that of a
$d'''$-neighborhood of $x_0$). The two definitions are equivalent.

The condition that a map be continuous on all of $M$ is even easier to phrase,
because it is a theorem that $f: M \to N$ is continuous if and only if the inverse
image of every open set of $N$ is open in $M$. The proof of this is not difficult. If $f$
is continuous at all $x$, then the inverse image of any open set is open because it
contains an open set containing each point in the inverse image. Conversely, if
the inverse image of every open set of $N$ is open, then it contains an open set
about any of its points, so $f$ is continuous at each of these points.

The open-set definition of continuity is much easier to use and to understand 
than the ε-δ one, particularly for functions of more than one variable, and it is 
of course the only possible definition applicable to general maps between topo- 
logical spaces. 

Having defined continuity, we can go on to define differentiation of functions
in the usual way. If $f(x_1, \dots, x_n)$ is a function defined on some open
region $S$ of $R^n$, then it is said to be differentiable of class $C^k$ if all its partial
derivatives of order less than or equal to $k$ exist and are continuous functions on
$S$. As a shorthand, such a function $f$ is said to be a $C^k$ function. Special cases are
$C^0$ (a continuous function) and $C^\infty$ (a function, all of whose derivatives exist:


Fig. 1.9. A continuous function, as defined in the text. Notice that $f$
maps the neighborhood of $x_0$ of radius $\delta$ into that of $f(x_0)$ of radius $\epsilon$,
while the inverse image of the latter neighborhood includes the former
but is not necessarily identical to it. It can contain other regions of the
x-axis, such as the one on the right. If $f$ is continuous in this second
neighborhood as well then the inverse image will be an open set.
usually called an infinitely differentiable function). Obviously, a function of
class $C^k$ is also of class $C^j$ for all $0 \le j \le k$. It is also possible to define derivatives
of more general continuous maps. These are usually called the differentials
of the map. The interested reader may consult Choquet-Bruhat, DeWitt-Morette
& Dillard-Bleick (1977), or Warner (1971) in the bibliography.

If $f$ is a 1-1 map of an open set $M$ of $R^n$ onto another open set $N$ of $R^n$, it
can be expressed concretely as

$$y_i = f_i(x_1, x_2, \dots, x_n), \quad \text{or} \quad y = f(x),$$

where $\{x_i, i = 1, \dots, n\}$ define a point $x$ of $M$ and $\{y_i, i = 1, \dots, n\}$ likewise
define a point $y$ of $N$. If the functions $\{f_i, i = 1, \dots, n\}$ are all $C^k$-differentiable,
then the map is said to be $C^k$-differentiable. The Jacobian matrix of a $C^1$ map is
the matrix of partial derivatives $\partial f_i/\partial x_j$. The determinant of this matrix is simply
called the Jacobian, $J$, and is often denoted by

$$J = \partial(f_1, \dots, f_n)/\partial(x_1, \dots, x_n). \tag{1.5}$$

If the Jacobian at a point $x$ is nonzero, then the inverse function theorem assures
us that the map $f$ is 1-1 and onto in some neighborhood of $x$ (see Choquet-Bruhat
et al., 1977, for a proof).
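
The words ‘in some neighborhood’ in the inverse function theorem matter, and a concrete map shows why. The Python sketch below uses a map of $R^2$ to $R^2$ chosen here purely for illustration (it does not appear in the text): $y_1 = x_1^2 - x_2^2$, $y_2 = 2 x_1 x_2$. Its Jacobian is nonzero everywhere except the origin, yet the map is not 1-1 on the plane as a whole.

```python
def f(x1, x2):
    """Illustrative map of R^2 to R^2: (x1, x2) -> (x1^2 - x2^2, 2 x1 x2)."""
    return (x1 * x1 - x2 * x2, 2 * x1 * x2)

def jacobian_matrix(x1, x2):
    """Matrix of partial derivatives (d f_i / d x_j), computed by hand for f above."""
    return [[2 * x1, -2 * x2],
            [2 * x2,  2 * x1]]

def jacobian(x1, x2):
    """Determinant of the 2 x 2 Jacobian matrix."""
    (a, b), (c, e) = jacobian_matrix(x1, x2)
    return a * e - b * c

# The Jacobian equals 4(x1^2 + x2^2): nonzero at (1, 2), zero only at the origin.
print(jacobian(1.0, 2.0), jacobian(0.0, 0.0))

# Nevertheless two distinct points have the same image, so f is only locally 1-1.
print(f(1.0, 2.0), f(-1.0, -2.0))
```

A nonzero Jacobian therefore guarantees invertibility near each point separately, but says nothing about whether widely separated points share an image.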
If a function $g(x_1, \dots, x_n)$ is mapped into a function $g_*(y_1, \dots, y_n)$ by the
rule

$$g_*(f_1(x_1, \dots, x_n), \dots, f_n(x_1, \dots, x_n)) = g(x_1, \dots, x_n)$$

(that is, $g_*$ has the same value at $f(x)$ as $g$ has at $x$), then the integral of $g$ over $M$
equals the integral of $g_*/J$ over $N$:

$$\int_M g(x_1, \dots, x_n)\, dx_1 \dots dx_n = \int_N g_*(y_1, \dots, y_n)\, \frac{1}{J}\, dy_1 \dots dy_n. \tag{1.6}$$

Since $g$ and $g_*$ have the same value at appropriate points, it is often said that the
volume-element $dx_1 \dots dx_n$ has changed to $dy_1 \dots dy_n / J$. This is a particularly
useful point of view if the map $f$ is viewed as a coordinate change. While this
should be familiar to readers from the calculus of many variables, we will
examine it in more detail in §2.25 and §4.8.
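
The statement that the integral of $g$ over $M$ equals the integral of $g_*/J$ over $N$ can be tested in one dimension. The Python sketch below is a hypothetical example constructed here, not taken from the text: the map is $y = f(x) = x^2$ from $M = (1, 2)$ onto $N = (1, 4)$, the function is $g(x) = x$, and the integrals are approximated with a simple midpoint rule.

```python
import math

def midpoint_integral(h, a, b, steps=10000):
    """Approximate the integral of h over (a, b) by the midpoint rule."""
    width = (b - a) / steps
    return width * sum(h(a + (k + 0.5) * width) for k in range(steps))

def g(x):
    """The function on M (an arbitrary choice for this sketch)."""
    return x

def g_star(y):
    """g_* has the same value at y = x^2 as g has at x, so g_*(y) = sqrt(y)."""
    return math.sqrt(y)

def J(y):
    """Jacobian df/dx = 2x of the map, written in terms of y = x^2."""
    return 2 * math.sqrt(y)

integral_over_M = midpoint_integral(g, 1.0, 2.0)
integral_over_N = midpoint_integral(lambda y: g_star(y) / J(y), 1.0, 4.0)

print(integral_over_M, integral_over_N)
```

Both numbers approximate the same value, the exact integral of $x$ from 1 to 2. Dropping the factor $1/J$ in the second integral would give a different answer, which is a quick way to check which way the Jacobian enters.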


1.3 Real analysis 
As just mentioned, it is assumed that the reader is familiar with the 
calculus of many variables. In this section we will cover just a few important 
points. 
A real function of a single real variable, $f(x)$, is said to be analytic at $x = x_0$ if
it has a Taylor expansion about $x_0$ which converges to $f(x)$ in some neighborhood
of $x_0$:

$$f(x) = f(x_0) + (x - x_0)\left(\frac{df}{dx}\right)_{x_0} + \frac{1}{2!}(x - x_0)^2\left(\frac{d^2f}{dx^2}\right)_{x_0} + \frac{1}{3!}(x - x_0)^3\left(\frac{d^3f}{dx^3}\right)_{x_0} + \dots \tag{1.7}$$

Naturally, functions which are not infinitely differentiable at $x_0$ (i.e. for which
$(d^n f/dx^n)_{x_0}$ does not exist for some $n$) are not analytic. But there are infinitely-differentiable
functions which are not analytic. A famous example is
$\exp(-1/x^2)$, whose value and all of whose derivatives are zero at $x = 0$, but
which is not identically zero in any neighborhood of $x = 0$. (This is explained by
the fact that the analytic extension of this function into the complex plane has
an essential singularity at $z = 0$; nevertheless, it is perfectly well-behaved on the
real line.) However, it is reassuring to know that analytic functions are good
approximations to many nonanalytic functions in the following sense. A real-valued
function $g(x_1, \dots, x_n)$ defined on an open region $S$ of $R^n$ is said to be
square-integrable if the multiple integral


$$\int_S |g(x_1, \dots, x_n)|^2\, dx_1\, dx_2 \dots dx_n \tag{1.8}$$

exists. It is a theorem of functional analysis that any square-integrable function
$g$ may be approximated by an analytic function $g'$ in such a way that the integral
of $(g - g')^2$ over $S$ may be made as small as one wishes. For this reason physicists
typically do not hesitate to assume that a given function is analytic if this
helps to establish a result, and we will follow this practice on occasion. Since a
$C^\infty$ function need not be analytic, there is a special notation for analytic functions:
$C^\omega$. Naturally, a $C^\omega$ function is $C^\infty$.
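
The behaviour of $\exp(-1/x^2)$ near $x = 0$ can be made tangible numerically. The Python sketch below is only suggestive and proves nothing: it tabulates the ratio $f(x)/x^k$ for a few small $x$ and a few powers $k$, all chosen arbitrarily here. If the function had a nonzero Taylor coefficient of order $k$ at the origin, the corresponding ratio could not tend to zero.

```python
import math

def f(x):
    """exp(-1/x^2), with the value 0 assigned at x = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

# For each power k, the ratio f(x) / x^k at successively smaller x.
ratios = {k: [f(x) / x ** k for x in (0.2, 0.1, 0.05)] for k in (1, 5, 10)}

for k, values in ratios.items():
    print(k, values)

# The function itself is not zero away from the origin.
print(f(0.5))
```

The ratios shrink rapidly for every power tried, consistent with a Taylor series at the origin whose terms all vanish, while the function is plainly nonzero for $x \neq 0$.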

An operator $A$ on functions defined on $R^n$ is a map which takes one function
$f$ into another one, $A(f)$. If $A(f)$ is just $gf$, where $g$ is another function, then the
operator is simply multiplicative. Other operators on functions on $R$ might be
simple differentiation,

$$D(f) = df/dx,$$

for example, or integration using a fixed kernel function $g$,

$$G(f)(x) = \int g(x, y) f(y)\, dy,$$

or a more complicated operation like

$$E(f) = f^2 + d^3f/dx^3.$$

In each case the operator may or may not be defined on all functions $f$. For
example, $D$ may not be defined on a function which is not $C^1$, while $G$ is
undefined on functions which give unbounded integrals. Specifying the set of
functions on which an operator is allowed to act in fact forms part of the definition
of the operator; this set is called its domain.

The commutator of two operators $A$ and $B$, called $[A, B]$, is another operator
defined by

$$[A, B](f) = (AB - BA)(f) = A[B(f)] - B[A(f)]. \tag{1.9}$$

If two operators have vanishing commutator, they are said to commute. Here
one has to be careful about the domains of the operators: the domain of $[A, B]$
may not be as large as that of either $A$ or $B$. For example, if $A = d/dx$ and
$B = x\,d/dx$, then we may take both their domains to be all $C^1$ functions.
But the successive operator $A(B(f))$ is not defined for all $C^1$ functions,
since it involves second derivatives. The operators $AB$ and $BA$ can be given all $C^2$
functions as domains, and this is a smaller set than the $C^1$ functions. Then, at least
at first, the commutator $[A, B]$ also has only $C^2$ functions in its domain. We can
enlarge the domain (also called extending the operator) in this case, though not
always, by the following observation. It is easy to work out that on any $C^2$
function $f$

$$[A, B](f) = [d/dx, x\,d/dx]f = df/dx,$$

because the second derivatives in $AB$ and $BA$ cancel out. So we can identify
$[A, B]$ simply with $d/dx$ (i.e. with $A$ itself) and thereby extend its domain to all $C^1$
functions. The lesson is that the commutator may be defined even on functions
on which the products in the commutator are not. When dealing with differential
operators it is often best, at least initially, to define their domain to be $C^\infty$
functions. This will be our approach later, as it eliminates the need to worry about
domains.
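
The cancellation of second derivatives can be checked mechanically on polynomials, which are certainly smooth. The following Python sketch is only an illustration: storing a polynomial as a list of coefficients, and the helper names deriv, x_deriv, sub and commutator, are assumptions made here and are not part of the text.

```python
# A polynomial is stored as a coefficient list: [c0, c1, c2] means c0 + c1*x + c2*x**2.

def deriv(p):
    """A = d/dx acting on a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def x_deriv(p):
    """B = x d/dx: differentiate, then multiply by x (shift coefficients up)."""
    return [0] + deriv(p)

def sub(p, q):
    """Coefficient-wise difference p - q, with trailing zeros stripped."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    r = [a - b for a, b in zip(p, q)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def commutator(p):
    """[A, B](p) = A(B(p)) - B(A(p))."""
    return sub(deriv(x_deriv(p)), x_deriv(deriv(p)))

f = [5, -3, 0, 2]        # 5 - 3x + 2x^3
print(commutator(f))     # coefficients of [A, B](f)
print(deriv(f))          # coefficients of df/dx, for comparison
```

Both printed lists should agree, which is the statement that [A, B] acts as d/dx on these functions.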


1.4 Group theory 
A collection of elements $G$ together with a binary operation (called $\cdot$) is
a group if it satisfies the axioms:

(Gi) Associativity: if $x$, $y$, and $z$ are in $G$, then
$$x \cdot (y \cdot z) = (x \cdot y) \cdot z.$$
(Gii) Right identity: $G$ contains an element $e$ such that, for any $x$ in $G$,
$$x \cdot e = x.$$
(Giii) Right inverse: for every $x$ in $G$ there is an element called $x^{-1}$, also in $G$,
for which
$$x \cdot x^{-1} = e.$$

A group is Abelian (commutative) if in addition

(Giv) $x \cdot y = y \cdot x$ for all $x$, $y$ in $G$.

A familiar example of a group with a finite set of elements is the group of all
permutations of $n$ objects; the binary composition of two permutations is simply
the permutation obtained by following one permutation by the other. This
group has $n!$ elements. Its identity element is the ‘permutation’ which leaves all
objects fixed.

We should note that a few simple conclusions can be deduced from (Gi)-(Giii):
the identity element $e$ is unique; it is also a left-identity ($e \cdot x = x$); the inverse
element $x^{-1}$ is unique for any $x$; and it is also a left-inverse ($x^{-1} \cdot x = e$). It is
common to omit the symbol $\cdot$ when there is no risk of confusion: $x \cdot y$ is simply
$xy$.
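
For a small finite group the axioms and their consequences can be checked by brute force. The sketch below does this for the permutations of three objects; representing a permutation as a tuple p in which object i is sent to p[i], and the names compose and inverse, are assumptions made for the illustration.

```python
from itertools import permutations

# A permutation of n objects is a tuple p in which object i is sent to p[i].
n = 3
G = list(permutations(range(n)))
e = tuple(range(n))              # the 'permutation' that leaves all objects fixed

def compose(p, q):
    """Follow permutation p by permutation q."""
    return tuple(q[p[i]] for i in range(n))

def inverse(p):
    """The permutation that undoes p."""
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Axioms (Gi)-(Giii)
assoc = all(compose(x, compose(y, z)) == compose(compose(x, y), z)
            for x in G for y in G for z in G)
right_identity = all(compose(x, e) == x for x in G)
right_inverse = all(compose(x, inverse(x)) == e for x in G)

# Consequences quoted in the text, and axiom (Giv)
left_identity = all(compose(e, x) == x for x in G)
left_inverse = all(compose(inverse(x), x) == e for x in G)
abelian = all(compose(x, y) == compose(y, x) for x in G for y in G)

print(len(G), assoc, right_identity, right_inverse)
print(left_identity, left_inverse, abelian)
```

The last flag comes out false: the permutation group of three objects satisfies (Gi)-(Giii) but not (Giv).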

The most important kind of group in modern physics is the Lie group, about
which we will have much to say later. We will give a precise definition in chapter
2, but here it is enough to say that it is a continuous group: any open set of
elements of a Lie group has a 1-1 map onto an open set of $R^n$ for some $n$. An
example of a Lie group is the translation group of $R^n$ ($x \to x + a$, $a = \text{const}$).
Each point $a$ of $R^n$ corresponds to an element of the group, so the group has in
fact a 1-1 map onto all of $R^n$. The group composition law is simply addition:
two elements $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$ compose to form
$c = (a_1 + b_1, \ldots, a_n + b_n)$, denoted symbolically as $c = a + b$. This example
illustrates the fact that one need not always use the symbol $\cdot$ to represent the
group operation. With Abelian groups, as this one is, it is more common to use
the symbol $+$.

A subgroup S of a group G is a collection of elements of G which themselves 
form a group with the same binary operation. (The prefix ‘sub’ always denotes a 
subset having the same properties as the larger set. We shall encounter many 
‘subs’: vector subspaces, submanifolds, Lie subalgebras, and Lie subgroups.) As a 
group, a subgroup must have an identity element. Since the group’s identity e is 
unique, any subgroup must also contain e. In the example of the permutation 
group, one can invent many subgroups. The permutations of n objects which do 
not change the position of the first object form a subgroup of the permutation 
group of n objects because, (i) the identity e leaves the first object fixed; (ii) the 
inverse of such a permutation also leaves the first object fixed; and (iii) the 
composition of any two such permutations still leaves the first object fixed. In
fact this subgroup is identical to the group of permutations of n — 1 objects. The 
reader should try to prove that the set of all even permutations is also a sub- 
group of the permutation group, and that the set of all odd permutations is not 
a subgroup. 
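
The three conditions (i)-(iii) used above translate directly into a test that can be run on any subset of a finite group. The following sketch applies it to the permutations of four objects; the tuple representation of a permutation and the function name is_subgroup are assumptions made for the illustration.

```python
from itertools import permutations
from math import factorial

n = 4
G = list(permutations(range(n)))     # object i is sent to p[i]
e = tuple(range(n))

def compose(p, q):
    """Follow permutation p by permutation q."""
    return tuple(q[p[i]] for i in range(n))

def inverse(p):
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_subgroup(S):
    """Check (i) identity, (ii) inverses and (iii) closure for a subset S of G."""
    S = set(S)
    return (e in S
            and all(inverse(p) in S for p in S)
            and all(compose(p, q) in S for p in S for q in S))

fixes_first = [p for p in G if p[0] == 0]   # leave the first object in place
moves_first = [p for p in G if p[0] != 0]   # the complementary set

print(is_subgroup(fixes_first), len(fixes_first), factorial(n - 1))
print(is_subgroup(moves_first))
```

The first subset passes and has (n - 1)! elements; the second fails at once because it does not contain the identity.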

Our statement that a certain subgroup of the permutation group of $n$ objects
‘is identical to’ the group of permutations of $n - 1$ objects is an example of a
group isomorphism. Two groups, $G_1$ and $G_2$, with binary operations $\cdot$ and $*$
respectively, are isomorphic (which just means identical in their group properties)
if there is a 1-1 map $f$ of $G_1$ onto $G_2$ which respects the group operations:

$$f(x \cdot y) = f(x) * f(y). \tag{1.10}$$

The isomorphism $f$ for our example is trivial: an element of the subgroup of the
$n$-permutation group which permutes only the last $n - 1$ objects is mapped to
the same permutation in the $(n - 1)$-permutation group. But an isomorphism is
not always so trivial. Let $G_1$ be the group of positive real numbers with the
operation of multiplication, and let $G_2$ be the group of all real numbers with the
operation of addition. (Why are these groups?) Then if $x$ is a number in $G_1$,
$f(x) = \log x$ defines a map $f: G_1 \to G_2$ which satisfies (1.10):

$$\log(xy) = \log x + \log y.$$

The two groups are isomorphic and $f$ is an isomorphism.
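
A numerical spot check of this isomorphism is easy to write down. The sketch below is only illustrative: the sample pairs are arbitrary, floating-point arithmetic is compared within a tolerance, and the names f and f_inverse are chosen here.

```python
import math

def f(x):
    """The map from the positive reals (under multiplication) to the reals (under addition)."""
    return math.log(x)

def f_inverse(u):
    """The inverse map, whose existence shows that f is 1-1 and onto."""
    return math.exp(u)

samples = [(0.5, 8.0), (3.0, 7.0), (2.5, 0.04)]

# Equation (1.10): f(x y) = f(x) + f(y), up to rounding error
respects_operation = all(
    math.isclose(f(x * y), f(x) + f(y), abs_tol=1e-12) for x, y in samples
)
maps_identity = f(1.0) == 0.0     # the identity 1 is sent to the identity 0
round_trip = all(math.isclose(f_inverse(f(x)), x) for x, _ in samples)

print(respects_operation, maps_identity, round_trip)
```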

Another useful relation between groups is called a group homomorphism. This
is like an isomorphism except that the map can be many-to-one and may only be
into. (See §1.2 for terminology.) Equation (1.10) must still be satisfied. A trivial
homomorphism of a group into itself is a map which maps every element of the
group to the identity $e$. Less trivial is the homomorphism from the permutation
group onto the multiplicative group whose only elements are $\{1, -1\}$. This
homomorphism maps any even permutation to $1$ and any odd one to $-1$. The
reader should verify (1.10) for this example, i.e. that the composition of two
odd permutations is even, that of an odd and even is odd, and that of two evens
is even.
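
The verification asked of the reader can also be done exhaustively for a small case. The sketch below checks (1.10) for the permutations of four objects, taking the parity of a permutation from its number of inversions; that way of computing parity, and the name sign, are assumptions of the illustration rather than anything fixed by the text.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))     # object i is sent to p[i]

def compose(p, q):
    """Follow permutation p by permutation q."""
    return tuple(q[p[i]] for i in range(n))

def sign(p):
    """+1 for an even permutation, -1 for an odd one, by counting inversions."""
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

# Equation (1.10) with multiplication in {1, -1} as the second group operation
is_homomorphism = all(sign(compose(x, y)) == sign(x) * sign(y) for x in G for y in G)
evens = [p for p in G if sign(p) == 1]

print(is_homomorphism, len(evens), len(G))
```

Half of the permutations are even, and the map is many-to-one, as a homomorphism is allowed to be.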


1.5 Linear algebra 
A set $V$ is a vector space (over the real numbers) if it has a binary
operation called $+$ with which it is an Abelian group (see above) and if multiplication
($\cdot$) by real numbers is defined to satisfy the following axioms (in which $x$ and $y$
are vectors and $a$, $b$ real numbers):
(Vi) $a \cdot (x + y) = (a \cdot x) + (a \cdot y)$,
(Vii) $(a + b) \cdot x = (a \cdot x) + (b \cdot x)$,
(Viii) $(ab) \cdot x = a \cdot (b \cdot x)$,
(Viv) $1 \cdot x = x$.
The identity element of $V$ is called $\bar{0}$, or simply 0. Apart from the usual
examples of vector spaces, note that the following are vector spaces:
(i) The set of all $n \times n$ matrices, where ‘$+$’ means adding corresponding entries
and ‘$\cdot$’ means multiplying each entry by the real number.
(ii) The set of all real continuous functions $f(x)$ defined on the interval
$a \le x \le b$.


It is usual to drop the multiplication dot and parentheses used in these axioms.
An expression like

$$ax + by + cz \tag{1.11}$$

is called a linear combination of the vectors $x$, $y$, and $z$. A set of elements
$\{x_1, x_2, \ldots, x_m\}$ of $V$ is linearly independent if it is impossible to find real numbers
$\{a_1, a_2, \ldots, a_m\}$ not all zero for which

$$a_1 x_1 + a_2 x_2 + \ldots + a_m x_m = 0. \tag{1.12}$$

The set is a maximal linearly independent set if including any other vector of $V$
in it would make it linearly dependent. By definition this means that any other
vector in $V$ can be expressed as a linear combination of elements of a maximal
set, and so a maximal set forms a basis for $V$. For example, if $V$ is the set of
$n \times n$ real matrices, then one basis is the collection of the $n^2$ different matrices
that each have zeroes everywhere except for a one in a single entry. In general,
the number of vectors in a basis is the dimension of $V$. (All bases have the same
number of elements, if that number is finite.) Let the vectors $\{x_i, i = 1, \ldots, n\}$
be a basis. Then an arbitrary vector $y$ is expressible as

$$y = \sum_{i=1}^{n} a_i x_i. \tag{1.13}$$

The numbers $\{a_i, i = 1, \ldots, n\}$ are the components of $y$ on this basis.


A subspace of a vector space $V$ is a subset of $V$ that is itself a vector space.
(Compare this with the definition of a subgroup in §1.4.) In particular, it must
include the zero vector and all linear combinations of any of its elements. Any
set of vectors $\{y_1, \ldots, y_m\}$ is said to generate the subspace of $V$ which is formed
by all possible linear combinations

$$a_1 y_1 + a_2 y_2 + \ldots + a_m y_m.$$

If $m < n$, where $n$ is the dimension of $V$, this is necessarily a proper subspace,
i.e. one not identical with $V$. In
any case, the dimension of the subspace is the maximum number of linearly
independent vectors among the generators.
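
The maximum number of linearly independent vectors among a set of generators can be found by row reduction. The sketch below is one way of doing this, under the assumption that vectors are given by their components on some basis; the use of Gaussian elimination with exact fractions, and the name rank, are choices made for the illustration.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, found by Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                    # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Three generators in a three-dimensional space; the third is the sum of the first two
generators = [(1, 0, 2), (0, 1, 1), (1, 1, 3)]
print(rank(generators))                      # dimension of the generated subspace
```

Here three generators span only a two-dimensional subspace, so having as many generators as the dimension of $V$ does not by itself guarantee that they generate all of $V$.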

So far nothing has been said about inner products or magnitudes of vectors. 
These are additional concepts which may or may not be useful in particular 
applications involving vectors: there is no necessity to impose them on a vector 
space. One way to introduce them is to define a norm on a vector space. A 
normed vector space V is a vector space with a mapping from V into the real 
numbers (i.e. a function that assigns to every vector a real number called its 
norm), where the map satisfies the axioms 

(Ni) $n(x) \ge 0$ for all $x$ in $V$, and $n(x) = 0$ if and only if $x = 0$;

(Nii) $n(ax) = |a|\,n(x)$ for all $a$ in $R$ and $x$ in $V$;

(Niii) $n(x + y) \le n(x) + n(y)$ for all $x$, $y$ in $V$.

There are many functions that can satisfy these axioms. Consider, for example,
$R^n$ itself as a vector space, where vector addition is defined by

$$x + y = (x_1 + y_1, \ldots, x_n + y_n), \tag{1.14}$$

and multiplication by real numbers by

$$ax = (ax_1, \ldots, ax_n). \tag{1.15}$$

Then, corresponding to three of the four distance functions defined in §1.1, we
can define a norm, the ‘distance’ of a vector from the origin:

$$n(x) = [(x_1)^2 + (x_2)^2 + \ldots + (x_n)^2]^{1/2}, \tag{1.16}$$
$$n'(x) = [4(x_1)^2 + 0.1(x_2)^2 + \ldots + (x_n)^2]^{1/2}, \tag{1.17}$$
$$n'''(x) = \text{maximum}\,(|x_1|, |x_2|, \ldots, |x_n|). \tag{1.18}$$

The reader should verify that each norm satisfies axioms (Ni)-(Niii). In addition,
the reader should verify that $d''(x, y)$ does not define a norm.

The first two norms are distinguished from the third by satisfying an 
additional axiom which one may wish to impose, the parallelogram rule: 


(Niv) $[n(x + y)]^2 + [n(x - y)]^2 = 2[n(x)]^2 + 2[n(y)]^2$.

Such a norm permits one to define a bilinear symmetric inner product between
two vectors

$$x \cdot y = \tfrac{1}{4}[n(x + y)]^2 - \tfrac{1}{4}[n(x - y)]^2. \tag{1.19}$$

Bilinearity means:

$$(ax + by) \cdot z = a(x \cdot z) + b(y \cdot z), \tag{1.20}$$

$$z \cdot (ax + by) = a(z \cdot x) + b(z \cdot y). \tag{1.21}$$

Symmetry means:

$$x \cdot y = y \cdot x. \tag{1.22}$$

In addition, the inner product is positive-definite, i.e.

$$x \cdot x \ge 0 \quad \text{and} \quad x \cdot x = 0 \ \text{only if} \ x = 0. \tag{1.23}$$

This follows trivially since $x \cdot x = [n(x)]^2$.
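
The role of the parallelogram rule can be seen numerically. The sketch below evaluates both sides of (Niv) for the Euclidean norm, for the weighted norm with weights 4 and 0.1, and for the maximum norm, and then builds the inner product (1.19) from the Euclidean norm. The function names and the test vectors are assumptions made for the illustration, and a check on a few vectors is of course not a proof.

```python
import math

def n_euclid(x):
    return math.sqrt(sum(c * c for c in x))

def n_weighted(x):
    # weights 4 and 0.1 on the first two components, 1 on the others
    w = [4.0, 0.1] + [1.0] * (len(x) - 2)
    return math.sqrt(sum(wi * c * c for wi, c in zip(w, x)))

def n_max(x):
    return max(abs(c) for c in x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def parallelogram_gap(n, x, y):
    """Left side minus right side of (Niv); zero when the rule holds."""
    return n(add(x, y)) ** 2 + n(sub(x, y)) ** 2 - 2 * n(x) ** 2 - 2 * n(y) ** 2

def inner(n, x, y):
    """The inner product (1.19) built from a norm n."""
    return 0.25 * n(add(x, y)) ** 2 - 0.25 * n(sub(x, y)) ** 2

x = [1.0, 0.0, 0.0]
y = [0.0, 1.0, 0.0]
for n in (n_euclid, n_weighted, n_max):
    print(n.__name__, parallelogram_gap(n, x, y))

# For the Euclidean norm, (1.19) reproduces the familiar sum of products
print(inner(n_euclid, [1.0, 2.0, 3.0], [4.0, -5.0, 6.0]))
```

The gap vanishes (up to rounding) for the first two norms but not for the maximum norm, which is why only the first two define an inner product in this way.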

The norm $n(x)$ on $R^n$ defined above is called the Euclidean norm. When we
regard $R^n$ as a vector space with this norm we denote it by $E^n$ and call it
$n$-dimensional Euclidean space. It is important to bear in mind the distinction
between $R^n$ and $E^n$: $R^n$ is simply the set of all $n$-tuples $(x_1, \ldots, x_n)$, without
any implication of distance, vector properties, or norms. The purpose of making
this distinction will become clear in chapter 2.

To define the inner product and show it was bilinear and symmetric, only
axioms (Nii) and (Niv) of norms are in fact necessary. A pseudo-norm is one
which violates (Ni) and (Niii): the inner product of a vector with itself is not
necessarily positive. Special relativity is an example of a physical theory using a
pseudo-norm, and we will look in some detail at it later.

While we have defined only a vector space over the real numbers, we can just 
as easily define one over the complex numbers by allowing the numbers a and b 
in (Vi)—(Viv) to be complex. Then a vector will have complex components. Such 
vector spaces are commonly used in quantum mechanics. 


1.6 The algebra of square matrices 
A linear transformation $T$ on a vector space $V$ is a map from $V$ into
itself which obeys the rule of linearity (cf. equations (1.20) and (1.21))


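$$T(ax + by) = aT(x) + bT(y). \tag{1.24}$$

If we have a basis $\{e_i, i = 1, \ldots, n\}$ for $V$, then

$$x = \sum_{i=1}^{n} a_i e_i, \tag{1.25}$$

$$T(x) = T\left(\sum_{i=1}^{n} a_i e_i\right) = \sum_{i=1}^{n} a_i T(e_i) = \sum_{i=1}^{n} a_i \sum_{j=1}^{n} T_{ij} e_j, \tag{1.26}$$

where we have replaced each vector $T(e_i)$ by its component form $\sum_{j=1}^{n} T_{ij} e_j$. The
numbers $T_{ij}$ are called the components of $T$, and can be represented as a square
$n \times n$ matrix.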
A very important algebraic result with which the reader should be
comfortable is the following:

$$\sum_{i=1}^{n} A_i \left(\sum_{j=1}^{m} B_{ij} C_j\right) = \sum_{j=1}^{m} C_j \left(\sum_{i=1}^{n} A_i B_{ij}\right). \tag{1.27}$$

That is, the order in which the sums are performed makes no difference.
Consequently, it is customary to write the above expression as

$$\sum_{i=1}^{n} \sum_{j=1}^{m} A_i B_{ij} C_j, \quad \text{or simply} \quad \sum_{i,j} A_i B_{ij} C_j, \tag{1.28}$$

emphasizing that the sum is simply the sum of various products over all possible
combinations of indices.

Two successive linear transformations $T$ and $U$ acting on the space $V$ produce
the transformation $UT$:

$$\begin{aligned}
UT(x) &= U(T(x)) \\
&= U\left(\sum_{i,j} a_i T_{ij} e_j\right) \\
&= \sum_{i,j,k} a_i T_{ij} U_{jk} e_k \\
&= \sum_{i,k} a_i \left(\sum_j T_{ij} U_{jk}\right) e_k.
\end{aligned} \tag{1.29}$$

From this it follows that the components of $UT$ are

$$\sum_j T_{ij} U_{jk}. \tag{1.30}$$

It is important to realize that if we represent $T_{ij}$ as a matrix ($i$ being the row
index and $j$ the column index), and similarly for $U_{jk}$, then the sum (1.30) is just
the matrix product of their respective matrices. Generally speaking, if $A_{ij}$ and $B_{ij}$
are matrices, then their matrix products are

$$(AB)_{ik} = \sum_j A_{ij} B_{jk} = \sum_j B_{jk} A_{ij}, \tag{1.31}$$

$$(BA)_{ik} = \sum_j B_{ij} A_{jk} = \sum_j A_{jk} B_{ij}. \tag{1.32}$$

Notice that the third expression in equation (1.31) equals the second simply
because each $A_{ij}$ and $B_{ij}$ is a number, and multiplication of numbers is
commutative. By comparing the third expression of equation (1.31) with the second of
(1.32), we see that what is important is not the order of the factors but the
positions of the summation index and of the free indices. The inequivalence of
these two expressions means that matrix multiplication is generally not
commutative.
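
The index sums for the two matrix products are short enough to write out directly. The sketch below implements the sum over the repeated index for matrices stored as nested lists; the name matmul and the two sample matrices are assumptions made for the illustration.

```python
def matmul(A, B):
    """(AB)_ik = sum over j of A_ij B_jk, for matrices stored as lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]

A = [[1, 2],
     [0, 1]]
B = [[1, 0],
     [3, 1]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)    # [[7, 2], [3, 1]]
print(BA)    # [[1, 2], [3, 7]]
```

Each product of entries is an ordinary commuting product of numbers; the two results differ only because the summed index sits in different positions.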
The transpose $A^T$ of a matrix $A$ has elements

$$(A^T)_{ij} = A_{ji}. \tag{1.33}$$

(If $A$ is complex then we define the adjoint $A^*$ of $A$ by $(A^*)_{ij} = \bar{A}_{ji}$, where a bar
denotes complex conjugation.) The unit matrix, $I$, has ones on the main diagonal
and zeroes elsewhere; this is symbolized by

$$(I)_{ij} = \delta_{ij}, \tag{1.34}$$

where $\delta_{ij}$ is the Kronecker delta symbol, which is 1 if $i = j$, and 0 if $i \neq j$. The
identity transformation is the one which maps any vector $x$ into itself. It has
components $\delta_{ij}$ on any basis. The inverse $A^{-1}$ of a matrix $A$ is a matrix such that

$$A^{-1}A = AA^{-1} = I. \tag{1.35}$$

Not every matrix has an inverse, the zero matrix being an obvious example of one
that does not. When an
inverse exists it is unique. Clearly $A$ is the inverse of $A^{-1}$. If $A^{-1}$ exists, $A$ is said
to be nonsingular. (Otherwise it is singular.) The set of all nonsingular $n \times n$
matrices forms a group with the operation of matrix multiplication. The group
identity is the matrix $I$. This group is an extremely important Lie group called
$GL(n, R)$ and we will study it carefully in chapter 3.

The determinant of a $2 \times 2$ matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

is called $\det(A)$, and is defined as

$$\det(A) = ad - bc. \tag{1.36}$$

The determinant of an $n \times n$ matrix is defined by induction on $(n - 1) \times (n - 1)$
matrices by the following rule of cofactors. The cofactor of an element $a_{ij}$ of $A$
is called $a^{ij}$ and is defined as $(-1)^{i+j}$ times the determinant of the
$(n - 1) \times (n - 1)$ matrix formed by eliminating from $A$ the row and column that $a_{ij}$
belongs to. Thus, in the matrix

$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & k \end{pmatrix} \tag{1.37}$$

the cofactor of $a$ is $ek - fh$, while that of $f$ is $bg - ah$. Then the determinant of
$A$ is defined to be

$$\det(A) = \sum_{j=1}^{n} a_{ij} a^{ij} \quad \text{for any fixed } i. \tag{1.38}$$

For the matrix (1.37), taking $i = 1$ gives

$$\det(A) = a(ek - fh) + b(fg - dk) + c(dh - eg),$$

while taking $i = 2$ gives

$$\det(A) = d(hc - bk) + e(ak - cg) + f(bg - ah),$$

both of which are the same. This rule always looks very mysterious when
presented this way. It will make much more sense when we look at it again in
§4.12.

The rows (or columns) of an n x n matrix may each be thought of as giving 
the components of a vector in some n-dimensional vector space. The determi- 
nant of a matrix vanishes if and only if the n vectors defined by its rows (or 
columns) are linearly dependent. This follows from some other properties of 
determinants: if a single row is multiplied by a constant $\lambda$, the determinant is
multiplied by $\lambda$; if one row is replaced element-by-element by the sum of itself
and any multiple of another row, the determinant is unchanged; and if any two 
rows are exchanged, the determinant changes sign. These properties are equally 
true if ‘row’ is replaced by ‘column’. Again, they will make more sense after 
studying §4.12. 

Suppose we construct a matrix $B$ from a matrix $A$ in such a way that

$$b_{ij} = a^{ji}/\det(A). \tag{1.39}$$

Then (1.38) shows that

$$\sum_{j=1}^{n} a_{ij} b_{ji} = 1, \quad \text{for any fixed } i.$$

It turns out (as experimentation can convince you) that

$$\sum_{j=1}^{n} a_{ij} b_{jk} = \delta_{ik},$$

or, in other words, that the inverse of $A$ is the matrix $B$ whose elements are given
by (1.39). It follows that $A$ is nonsingular if and only if $\det(A) \neq 0$.
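
The ‘experimentation’ suggested here is easy to carry out by machine. The following sketch computes determinants by cofactor expansion along an arbitrary row and builds the inverse from the cofactors; the function names, the 0-based indices and the use of exact fractions are assumptions made for the illustration, and the recursive expansion is practical only for small matrices.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def cofactor(A, i, j):
    """(-1)**(i+j) times the determinant of the corresponding minor."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def det(A, i=0):
    """Cofactor expansion along row i; every row should give the same value."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum(A[i][j] * cofactor(A, i, j) for j in range(n))

def inverse(A):
    """Matrix B whose (i, j) element is the cofactor of a_ji divided by det(A)."""
    n, d = len(A), det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(cofactor(A, j, i), d) for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

A = [[2, 0, 1],
     [1, 3, 2],
     [0, 1, 4]]
print([det(A, i) for i in range(3)])   # the same value whichever row is used
B = inverse(A)
print(matmul(A, B))                    # the unit matrix, as exact fractions
print(det([A[1], A[0], A[2]]))         # exchanging two rows changes the sign
```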
The trace of a matrix $A$ is the sum of its diagonal elements:

$$\text{tr}(A) = \sum_{i} a_{ii}. \tag{1.40}$$

A similarity transformation of $A$ by a nonsingular matrix $B$ is a map $A \to B^{-1}AB$.
The following list of useful formulae may be proved from the definitions we
have given:

$$(AB)^T = B^T A^T, \tag{1.41}$$
$$(AB)^{-1} = B^{-1} A^{-1}, \tag{1.42}$$
$$\det(AB) = \det(A)\det(B), \tag{1.43}$$
$$\text{tr}(B^{-1}AB) = \text{tr}(A), \tag{1.44}$$
$$\det(B^{-1}AB) = \det(A), \tag{1.45}$$
$$\det(A^T) = \det(A). \tag{1.46}$$


The $n \times n$ matrix $A$ has an eigenvalue $\lambda$ and an eigenvector $V \neq 0$ if the
following equation holds:

$$A(V) = \lambda V, \tag{1.47}$$

where on the left $A$ acts as a linear transformation on $V$. In component form we
can write this as the $n$ equations

$$\sum_j (a_{ij} - \lambda \delta_{ij}) V_j = 0. \tag{1.48}$$

This has a nonzero solution for $V$ if and only if

$$\det(A - \lambda I) = 0. \tag{1.49}$$

The eigenvalues of $A$ are solutions to equation (1.49), which is clearly a
polynomial equation of $n$th order. To any real solution $\lambda$ there corresponds an
eigenvector. If a solution is complex there is no real eigenvector, so if $V$ is in a real
vector space there is a fundamental difference between real and complex
eigenvalues. If the vector space (and the matrix $A$) are complex, there is no particular
distinction that need be made. Suppose $\{\lambda_1, \ldots, \lambda_n\}$ are the $n$ roots of (1.49),
repeated as often as appropriate if they happen to be multiple roots. There are
three important results which we shall quote:

$$\{\text{eigenvalues of } A^T\} = \{\text{eigenvalues of } A\}, \tag{1.50}$$
$$\det(A) = \lambda_1 \lambda_2 \ldots \lambda_n, \tag{1.51}$$
$$\text{tr}(A) = \lambda_1 + \lambda_2 + \ldots + \lambda_n. \tag{1.52}$$

The last two can be proved by inspecting the polynomial in (1.49) closely. The
first follows from (1.46).
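
For a $2 \times 2$ matrix the polynomial in (1.49) is a quadratic, so the last two results can be checked directly. The sketch below is restricted to that case; the function name eigenvalues_2x2 and the two sample matrices are assumptions made for the illustration.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of det(A - lambda I) = lambda**2 - tr(A) lambda + det(A) = 0."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

symmetric = [[2, 1], [1, 2]]     # real eigenvalues
rotation = [[0, -1], [1, 0]]     # complex eigenvalues: no real eigenvector

for A in (symmetric, rotation):
    l1, l2 = eigenvalues_2x2(A)
    # product and sum of the roots, to compare with det(A) and tr(A)
    print(A, l1, l2, l1 * l2, l1 + l2)
```

The second matrix is a rotation of the plane by a right angle, which visibly leaves no real direction unchanged; its eigenvalues are accordingly complex, while their product and sum are still the (real) determinant and trace.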


1.7
and Topology, ed. C. DeWitt & B. DeWitt (Gordon & Breach, New
York, 1964). Another valuable introduction is the article Differential
geometry from a modern standpoint, by B. Schmidt in Relativity,
(Academic Press, New York, 1963); and Von Westenholz, Differential
Forms in Mathematical Physics (North-Holland, Amsterdam, 1979).


2 DIFFERENTIABLE MANIFOLDS AND TENSORS 


It is hard to imagine a physical problem which does not involve some sort of 
continuous space. It might be physical three-dimensional space, four-dimensional 
spacetime, phase space for a problem in classical or quantum mechanics, the 
space of all thermodynamic equilibrium states, or some still more abstract space. 
All these spaces have different geometrical properties, but they all share some- 
thing in common, something which has to do with their being continuous spaces 
rather than, say, lattices of discrete points. The key to differential geometry’s 
importance to modern physics is that it studies precisely those properties 
common to all such spaces. The most basic of these properties go into the defini- 
tion of the differentiable manifold, which is the mathematically precise substi- 
tute for the word ‘space’. 


2.1 Definition of a manifold 
As in §1.1, we denote by $R^n$ the set of all $n$-tuples of real numbers
$(x_1, x_2, \ldots, x_n)$. A set (of ‘points’) $M$ is defined to be a manifold if each point
of $M$ has an open neighborhood which has a continuous 1-1 map onto an open
set of $R^n$ for some $n$. (The reader unsure of what a 1-1 map onto something
means should look at §1.2.) This simply means that $M$ is locally ‘like’ $R^n$. The
dimension of $M$ is, of course, $n$. It is important that the definition involves only
open sets and not the whole of $M$ and $R^n$, because we do not want to restrict
the global topology of $M$. This will be clear in the example of the sphere in §2.2.
Notice that the map is only required to be 1-1, not to preserve lengths or angles
or any other geometrical notion. Length is not even defined at this level of
geometry, and we shall encounter physical applications in which we will not want
to introduce a notion of distance between points of our manifolds. At this
elementary (‘primitive’) geometrical level we are only trying to ensure that the
local topology of our space (as described in §1.1) is the same as that of $R^n$. A
manifold is a space with this topology.

By definition, the map associates with a point $P$ of $M$ an $n$-tuple $(x_1(P), \ldots,
x_n(P))$. These numbers $x_1(P), \ldots, x_n(P)$ are called the coordinates of $P$ under
this map, as illustrated in figure 2.1. One way of thinking about an
$n$-dimensional manifold is that it is simply any set which can be given $n$
independent coordinates in some neighborhood of any point, since these coordinates
actually define the required map to $R^n$. We shall adopt the standard notation of
writing the index of the coordinate as a superscript: $x^1(P), x^2(P), \ldots, x^n(P)$ are
the $n$ coordinates of $P$ (not powers of $x(P)$!) under the map.

From the discussion so far, we ought now to have a general idea of what a
manifold is, but to do any better than this we must examine the nature of these
coordinate maps. Suppose $f$ is a 1-1 map from a neighborhood $U$ of a point $P$ of
$M$ onto an open set $f(U)$ of $R^n$. As stressed above, the neighborhood $U$ does not
necessarily include all of $M$ (we shall see in §2.2 that on the sphere it cannot
include the whole sphere), so there will be other neighborhoods with their own
maps, and each point of $M$ must lie in at least one such neighborhood. The pair
consisting of a neighborhood and its map is called a chart. It is easy to see that
these open neighborhoods must have overlaps if all points of $M$ are to be
included in at least one, and it is these overlaps which enable us to give a further
characterization of the manifold (refer to figure 2.2). Suppose $V$ is a
neighborhood overlapping $U$, and that $V$ has a map $g$ onto an open region of $R^n$. This
open region may be completely distinct from the one that $f$ maps $U$ onto. The
intersection of $V$ and $U$ is open (by axiom (Ti) of §1.1) and is given two
different coordinate systems by the two maps. There is thus some equation relating
these coordinate systems. To find it, pick a point in the image of the overlap
under $f$ (i.e. a point in $R^n$), say the point $(x^1, x^2, \ldots, x^n)$ in figure 2.3. The
map $f$ has an inverse $f^{-1}$, so there is a unique point $S$ in the overlap which has
these coordinates under $f$. Now let $g$ take us from $S$ to another point in $R^n$, say


Fig. 2.1. A region $U$ of $M$ has a 1-1 map $f$ onto a region $f(U)$ of $R^n$.
This map associates any point, say $P$, with a unique $n$-tuple of numbers
$(x_1, x_2, \ldots, x_n)$. In this way $U$ acquires a coordinate system,
illustrated by drawing the dashed lines that are the images under $f^{-1}$ of
the usual coordinate lines of $R^n$.

$(y^1, y^2, \ldots, y^n)$. (What we have done is constructed the composite map of
$R^n \to R^n$, called $g \circ f^{-1}$.) In this way we obtain a functional relationship (a
coordinate transformation)

$$\begin{aligned}
y^1 &= y^1(x^1, x^2, \ldots, x^n), \\
y^2 &= y^2(x^1, x^2, \ldots, x^n), \\
&\;\;\vdots \\
y^n &= y^n(x^1, x^2, \ldots, x^n).
\end{aligned}$$

If the partial derivatives of order $k$ or less of all these functions $\{y^i\}$ with respect
to all the $\{x^j\}$ exist and are continuous, then the maps $f$ and $g$ (strictly, the
charts $(U, f)$ and $(V, g)$) are said to be $C^k$-related. (This is the notation
introduced in §1.2 for differentiability.) If it is possible to construct a whole system
of charts (called, appropriately enough, an atlas) in such a way that every point
of charts (called, appropriately enough, an atlas) in such a way that every point 


Fig. 2.2. The neighborhoods $U$ and $V$ in $M$ overlap (shaded area). Their
respective maps to $R^n$, $f$ and $g$, give two different maps (hence two
coordinate systems) to the overlap region. The relation between these
coordinates characterizes the differentiability class of the manifold.

Fig. 2.3. A magnification of figure 2.2, which shows how the overlap
makes a map from $R^n$ to $R^n$, which is $f^{-1}$ followed by $g$ (called
$g \circ f^{-1}$).

of $M$ is in at least one neighborhood and every chart is $C^k$-related to every other
one it overlaps with, then the manifold $M$ is said to be a $C^k$ manifold. A
manifold of class $C^1$ (which includes $C^k$ for $k > 1$) is called a differentiable manifold.

The differentiability of a manifold endows it with an enormous amount of 
structure: the possibility of defining tensors, differential forms, and Lie deriva- 
tives. This differential structure is our main subject. Remember, we have not 
introduced the concept of distance on M, and we have no notion of the ‘shape’ 
or ‘curvature’ of M. We only know that locally it is smooth, and that is all we 
need for what follows. 

In most applications we will assume a $C^\infty$ manifold, but usually this is not
strictly necessary. There will be times when we shall find it convenient to assume
an analytic manifold ($C^\omega$: the functions $\{y^i\}$ are analytic functions of $\{x^j\}$), but
this will be in the physicist’s spirit of invoking analyticity where convenient, as
mentioned in §1.3. We will take the view that in learning this subject for the
first time it is better to make rather strong assumptions about the manifold in 
order to see what is going on in the differential geometry. After the student is 
more comfortable with the subject he can worry about relaxing his assumptions. 
Accordingly, the reader should assume throughout the book that any manifold 
is sufficiently differentiable for whatever argument we happen to be using. 


2.2 The sphere as a manifold 

One of the simplest examples of a manifold, which illustrates the
importance of allowing for more than one chart, is the sphere. (The word ‘sphere’
always means the surface of the sphere, not its interior.) Consider the two-sphere
(called $S^2$), the set of points in $R^3$ for which $(x^1)^2 + (x^2)^2 + (x^3)^2 = \text{const}$.
Any point has a sufficiently small neighborhood which has a 1-1 map onto a
disc in $R^2$ (see figure 2.4). This shows that the map involved certainly will not
preserve lengths or angles. As a specific example of a map, consider the usual
spherical coordinates, with $\theta = x^1$ and $\phi = x^2$. Then the sphere appears to be
mapped onto the rectangle $0 \le x^1 \le \pi$, $0 \le x^2 \le 2\pi$, as shown in figure 2.5. But
there are some funny features here. First, the map breaks down at the pole
$\theta = 0$, where one point is ‘mapped’ to the whole line $x^1 = 0$, $0 \le x^2 \le 2\pi$. So


Fig. 2.4. A small neighborhood of a point $P$ on $S^2$ is mapped 1-1 onto
a disc in $R^2$.

at the pole there is not even a map. The second difficulty is that the points
having $\phi = 0$ are ‘mapped’ to two places, $x^2 = 0$ and $x^2 = 2\pi$: again, there is no
map. To get around these problems we must restrict the map to the open region
$0 < x^1 < \pi$, $0 < x^2 < 2\pi$. Then the two poles and the semicircle $\phi = 0$ joining
them are left out of the map. So here at least two maps are needed to cover
the sphere completely. The second one could be another spherical coordinate
system, this time with its line $\phi = 0$ in the equator of the first system, say from
$\phi = \pi/2$ to $\phi = 3\pi/2$. Then every point on the sphere is in at least one of these
two charts. The overlap functions, expressing the second system’s coordinates
in terms of the first one’s, will be complicated, but it should be clear that they
will be analytic. So the sphere is an analytic manifold.

A better map of $S^2$ onto a region of $R^2$, which fails at only one point, is the
so-called stereographic map of the sphere onto the plane, shown in figure 2.6 in
a vertical cross-sectional view. The sphere is tangent to the plane, and a line is
drawn from the point $N$ on the sphere diametrically opposite the point of
tangency. This line intersects the sphere at $P$, and $R^2$ at $Q$. This defines the map:


Fig. 2.5. Ordinary spherical coordinates appear to give a map from $S^2$
to $R^2$, which is good for ordinary points like $P$. But where is the image
of the north pole? And which of two points is the image of $Q$ on the
line $\phi = 0$?











Fig. 2.6. The stereographic map of $S^2$ to $R^2$. The set $S^2$ with the single
point $N$ removed is open, and this set is mapped onto all of $R^2$. The
map fails at $N$ itself.

$P$ is mapped to $Q$, or in other words the coordinates of $P$ in $S^2$ are just the
coordinates of $Q$ in $R^2$. This map is 1-1 except at $N$, for as the line from $N$
becomes horizontal ($P$ approaching $N$), the point $Q$ goes to infinity. But no
matter in what direction in $R^2$ the point $Q$ goes to infinity, the point $P$ always
approaches $N$. So $N$ is mapped into all of ‘infinity’ and another coordinate
patch must be used near $N$. There is no mapping which is good on all of $S^2$.
Notice that this whole discussion really depends only on the global topology of
$S^2$: exactly the same remarks apply to the surface of, say, a bowl or a wine glass,
which are simply deformations of $S^2$. On the other hand, the two-dimensional
interior of the annulus bounded by two concentric circles in $R^2$ can be covered
by a single coordinate patch. Try to find it!
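
The stereographic map is simple enough to write out explicitly. The sketch below assumes a particular placement that the text does not fix: a unit sphere centred at (0, 0, 1), tangent to the plane z = 0 at the origin, so that N is the point (0, 0, 2). The function names and the helper that generates points on the sphere are likewise assumptions of the illustration.

```python
import math

# Assumed setup: unit sphere centred at (0, 0, 1), tangent to the plane z = 0
# at the origin, so that the point N opposite the tangency is (0, 0, 2).

def to_plane(P):
    """Map P on the sphere (P != N) to Q = (u, v) in the plane z = 0."""
    x, y, z = P
    t = 2.0 / (2.0 - z)      # where the line N + t (P - N) meets z = 0
    return (t * x, t * y)

def to_sphere(Q):
    """Inverse map: the second intersection of the line NQ with the sphere."""
    u, v = Q
    s = 4.0 / (u * u + v * v + 4.0)
    return (s * u, s * v, 2.0 - 2.0 * s)

def point(theta, phi):
    """Hypothetical helper: the point at polar angle theta measured from N."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            1.0 + math.cos(theta))

# As P approaches N (theta -> 0) the image Q recedes to infinity
for theta in (math.pi, 1.0, 0.1, 0.001):
    Q = to_plane(point(theta, 0.3))
    print(theta, math.hypot(*Q))
```

Away from N the two functions undo one another, which is the 1-1 property; the growth of the printed distances as theta tends to zero shows why a second patch is needed near N.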


2.3 Other examples of manifolds 

The usefulness of the concept of a manifold really comes from its 
generality, the fact that it embraces sets which one might not ordinarily regard 
as spaces. By definition, any set M that can be parameterized continuously is a 
manifold whose dimension is the number of independent parameters. For 
example: 

(i) The set of all rotations of a rigid object in three dimensions is a manifold, 
since it can be continuously parameterized by the three ‘Euler angles’ (cf. 
Goldstein, 1950). 

(ii) The set of all (pure boost) Lorentz transformations is likewise a three- 
dimensional manifold; the parameters are the three components of the velocity 
of the boost. 

(iii) For N particles, the numbers consisting of all their positions (3N num- 
bers) and velocities (3N numbers) define a point in a 6N-dimensional manifold, 
called phase space. 

(iv) Given an equation (algebraic or differential) for a dependent variable y in 
terms of an independent variable x, one can define the set of all (y, x) to be a 
manifold; any particular solution is a curve in this manifold. This concept is 
easily extended to arbitrary numbers of dependent and independent variables. 

(v) A particularly common manifold is a vector space, whose definition is 
given in §1.5. (Here we are dealing with vector spaces over the real numbers.) To 
see that such a space is a manifold we will construct a map from it to some R^n. 
Suppose the vector space V is n-dimensional, and choose any basis {ē_1, ..., ē_n}. 
Any vector ȳ is then representable as a linear combination 

$$ \bar{y} = a_1 \bar{e}_1 + \dots + a_n \bar{e}_n. \qquad (2.1) $$

But ȳ is a point of V, so this establishes a map from V to R^n, 
ȳ ↦ (a_1, ..., a_n). In 
fact every point of R^n corresponds to a unique vector in V under this map, so 
not only is V covered entirely by the single coordinate system we have just 
constructed, but V is identical, as a manifold, with R^n. In the language of group 
theory (§1.4), V and R^n are isomorphic. This is an important result. It means 
that every vector space may be thought of, when convenient, simply as R^n. 
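
As a small illustration of the map ȳ ↦ (a_1, ..., a_n), the following sketch 
computes the coordinates of a vector of a two-dimensional real vector space on 
a chosen basis. The function names, the sample basis, and the use of Cramer's 
rule are assumptions made for this example only. 

```python
def coordinates(y, e1, e2):
    """Return (a1, a2) such that y = a1*e1 + a2*e2 (Cramer's rule).

    y, e1, e2 are pairs of real numbers; e1, e2 must be linearly independent.
    """
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 do not form a basis")
    a1 = (y[0] * e2[1] - y[1] * e2[0]) / det
    a2 = (e1[0] * y[1] - e1[1] * y[0]) / det
    return a1, a2


def combine(a1, a2, e1, e2):
    """Inverse map: rebuild the vector from its coordinates (a1, a2)."""
    return (a1 * e1[0] + a2 * e2[0], a1 * e1[1] + a2 * e2[1])


# Hypothetical basis and vector, chosen only for illustration.
E1, E2 = (1.0, 1.0), (1.0, -1.0)
a = coordinates((3.0, 1.0), E1, E2)
```

Because `combine` undoes `coordinates` for every pair (a1, a2), the single 
coordinate system covers the whole space, which is the content of the 
statement that V is identical with R^n. 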

(vi) Example (i) above is an example of a Lie group, which we are now in a 
position to define. A Lie group G is a group which is also a C^∞ manifold, with 
the restriction that the group operation induces a C^∞ map of the manifold into 
itself. What this means is the following. Pick out any element a of the group. 
This element induces a map of G into itself, taking any element b of G into ba, 
b ↦ ba. This map must be C^∞; in concrete terms, in whatever coordinates are 
used on G, the coordinates of ba must be C^∞ functions of those of b. The 
demand for such a map is really a compatibility requirement, to ensure that the 
manifold property is compatible with the group property. In example (i) above, 
then, the set of all rotations forms a group, and it is not hard to show that this 
group structure is indeed compatible with the three-dimensional manifold 
structure. (This Lie group is called SO(3).) This definition of Lie groups may seem 
abstract and perhaps rather arid at first, but we shall become much more familiar 
with them in chapter 3. A simple example of a Lie group is R^n. It is a vector 
space (see (v) above) and therefore a group, and it is also a manifold: R^n is in 
fact the simplest Lie group. 
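
The compatibility requirement can be made concrete with a sketch. Instead of 
SO(3), the example below uses the simpler one-parameter group of rotations of 
the plane (an assumption made to keep the code short), with the rotation angle 
as coordinate. The helper names are hypothetical. The map b ↦ ba then reads 
θ_b ↦ θ_b + θ_a in coordinates, which is clearly C^∞. 

```python
import math


def rot(theta):
    """2x2 rotation matrix with angle theta, as nested tuples."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def angle(R):
    """Coordinate of a rotation: its angle in (-pi, pi]."""
    return math.atan2(R[1][0], R[0][0])


def right_translate(theta_b, theta_a):
    """Coordinate of the group element ba, given coordinates of b and a."""
    return angle(matmul(rot(theta_b), rot(theta_a)))
```

For small angles `right_translate(theta_b, theta_a)` agrees with 
`theta_b + theta_a`; the jump of the angle coordinate at ±π is a reminder 
that one coordinate patch does not cover the whole group. 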


2.4 Global considerations 

Because every manifold is locally the same as some R^n, any two manifolds 
of the same dimension (and same differentiability class) are locally 
indistinguishable at this level of differential geometry. But this is certainly not the case 
when we consider their global structure, as the comparison of S^2 with R^2 in 
§2.2 showed. Manifolds therefore divide up into classes according to their global 
properties. As an example, the sphere S^2 and the surface of a crayon have the 
same global structure. Although neither has a single map onto R^2, each has a 
single perfectly good 1-1 map onto the other, as illustrated in figure 2.7. 


Fig. 2.7. A smooth (C^∞) crayon can be mapped 1-1 onto a sphere S^2. 
The map is global, not restricted to patches. It is a diffeomorphism, 
and so is its inverse. 

(Strictly speaking, the crayon should have very smooth edges to be identical to 
S^2 as a C^∞ manifold.) Such a map directly from one C^∞ manifold M to another 
N, which is 1-1 and C^∞ (a map is C^∞ if the coordinates of a point in N are 
infinitely differentiable functions of the coordinates of the inverse image of 
the point in M) and whose inverse is also C^∞, is called a diffeomorphism of M 
onto N. The manifolds M and N are said to be diffeomorphic if such a map 
exists. The surface of a teacup is diffeomorphic to the torus (doughnut) because 
each has just one hole: one can smoothly deform one into the other. 

Most of the geometry we will study in this book will be local, depending only 
on the differential structure. But there will be occasions, such as our studies of 
fiber bundles and of integration of functions, when the global properties of our 
manifolds will become very important. 


2.5 Curves 

Curves in the manifold will be of great importance to us. One’s ordinary 
idea of a curve is that it is a continuous series of points in M. It is convenient 
here to make a somewhat different definition: a curve is a (differentiable) 
mapping from an open set of R^1 into M (see figure 2.8). Thus, one associates with 
each point of R^1 (which is a real number, say λ) a point in M, which is called the 
image point of λ. The set of all image points is the ordinary notion of the curve, 
but our definition gives each point a value of λ. Clearly we have a parameterized 
curve, with parameter λ. Thus, two curves are different even if they have the 
same image in M, provided they assign a different parameter value to the image 
points. Again, by a ‘differentiable’ mapping we simply mean that the coordinates 
of the image point, {x^i(λ), i = 1, ..., n}, are differentiable functions of λ. 
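
The distinction between a curve and its image can be seen in a short sketch. 
The two maps below are hypothetical examples in R^2, not taken from the text: 
both trace out the upper unit semicircle, but they assign different parameter 
values to the same image point, so by the definition above they are different 
curves. 

```python
import math


def curve_a(lam):
    """Curve a: lam in (0, pi) -> point on the upper unit semicircle."""
    return (math.cos(lam), math.sin(lam))


def curve_b(lam):
    """Curve b: lam in (0, pi/2) -> same semicircle, traversed twice as fast."""
    return (math.cos(2.0 * lam), math.sin(2.0 * lam))


def distance(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# The image point (0, 1) has parameter pi/2 on curve a but pi/4 on curve b.
top_a = curve_a(math.pi / 2)
top_b = curve_b(math.pi / 4)
```

Evaluating both maps at the same parameter value, for instance λ = 1, gives 
two different points, although the sets of all image points coincide. 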


Fig. 2.8. A curve in M is a map from R^1 into M. The point λ in R^1 is 
mapped to P in M. The image of the open interval from a to b in R^1 
is the line shown in M. 


2.6 Functions on M 

A function on M is a rule that assigns a real number (the value of the 
function) to each point of M. When a region of M is mapped differentiably onto 
a region of R^n, the function becomes a function on R^n (see figure 2.9). If this 
function is differentiable in R^n, then it is said to be a differentiable function on 
M. We can say the same thing in another way: abstractly, the function may be 
written as f(P), where P is a point of M. But P has coordinates, so one can 
express the value of the function by some algebraic expression f(x^1, x^2, ..., x^n). 
Then if this expression is differentiable in its arguments, the function is 
differentiable. The coordinates themselves, of course, are continuous and 
infinitely differentiable functions. For example, x^3 is the function such that x^3(P) is the 
value of the third coordinate of the point P. 

From now on we shall avoid referring to the mapping from M to R^n directly, 
although we shall occasionally refer to the coordinates (which describe the 
mapping). The purpose of discussing mappings up till now has been to establish 
the fundamental concepts in as precise a way as possible. From now on we shall 
be more interested in using these concepts to develop the differential structure 
of the manifold, so we will always assume that we can place coordinates {x^i, 
i = 1, ..., n} on the manifold, and that any sufficiently differentiable set of 
equations y^i = y^i(x^j) which is locally invertible (i.e. whose Jacobian is nonzero; 
see §1.2) constitutes an acceptable coordinate transformation to new 
coordinates {y^i, i = 1, ..., n}. 


Fig. 2.9. The function f on M is a map from M to R^1. The coordinate 
map g from a region U of M containing P onto a region g(U) of R^n has 
an inverse. The composite map f ∘ g^{-1} gives a map from R^n to R^1, 
which is a function on R^n. This is just the expression of f(P) in terms of 
the coordinates of P. 


2.7 Vectors and vector fields 

Consider a curve passing through the point P of M, described by the 
equations x^i = x^i(λ), i = 1, ..., n. Consider also a differentiable function 
f(x^1, ..., x^n) (abbreviated f(x^i)) on M. At each point of the curve, f has a 
value. Therefore, along the curve there is a differentiable function g(λ) which 
gives the value of f at the point whose parameter value is λ: 

$$ g(\lambda) = f(x^1(\lambda), \dots, x^n(\lambda)) = f(x^i(\lambda)). $$


Differentiating and using the chain rule gives 


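$$ \frac{dg}{d\lambda} = \sum_i \frac{dx^i}{d\lambda} \frac{\partial f}{\partial x^i}. \qquad (2.2) $$

This is true for any function f, so we can write 

$$ \frac{d}{d\lambda} = \sum_i \frac{dx^i}{d\lambda} \frac{\partial}{\partial x^i}. \qquad (2.3) $$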
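
Equation (2.2) can be checked numerically. The sketch below is only an 
illustration: the function f, the curve, and the step size h are arbitrary 
choices made here, and the derivative of g is approximated by a central 
difference rather than computed exactly. 

```python
import math


def f(x, y):
    """A sample differentiable function on R^2 (assumed for illustration)."""
    return x * x * y + math.sin(y)


def grad_f(x, y):
    """Partial derivatives (df/dx, df/dy) of the sample function."""
    return (2.0 * x * y, x * x + math.cos(y))


def curve(lam):
    """A sample curve x^i(lam) in R^2."""
    return (math.cos(lam), lam * lam)


def curve_tangent(lam):
    """Components dx^i/dlam of the tangent to the sample curve."""
    return (-math.sin(lam), 2.0 * lam)


def g(lam):
    """Value of f along the curve, g(lam) = f(x^i(lam))."""
    return f(*curve(lam))


def dg_chain_rule(lam):
    """Right-hand side of (2.2): sum over i of dx^i/dlam * df/dx^i."""
    tangent = curve_tangent(lam)
    gradient = grad_f(*curve(lam))
    return sum(t * d for t, d in zip(tangent, gradient))


def dg_finite_difference(lam, h=1e-6):
    """Left-hand side of (2.2), approximated by a central difference."""
    return (g(lam + h) - g(lam - h)) / (2.0 * h)
```

The two functions `dg_chain_rule` and `dg_finite_difference` agree to within 
the accuracy of the finite-difference approximation at any parameter value. 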


Now, in the ordinary view of vectors in Euclidean space, one would say that 
the set of numbers {dx^i/dλ} are components of a vector tangent to the curve 
x^i(λ); one can see this by realizing that {dx^i} are infinitesimal displacements 
along the curve, and that dividing them by dλ only changes the scale, not the 
direction, of this displacement. In fact, since a curve has a unique parameter, to 
every curve there is a unique set {dx^i/dλ}, which are then said to be components 
of the tangent to the curve. Thus, with our definition of a curve, every curve has 
a unique tangent vector. 

Of course, every vector is the tangent to an infinite number of different 
curves through P, for two different reasons. The first is that there are many 
curves which are tangent to one another and have the same tangent vector at P, 
and the second is that the same path may be re-parameterized in such a way as to 
give the same tangent at P. These are illustrated in figure 2.10. 


Fig. 2.10. (a) Two curves having the same tangent vector. (b) Two 
curves having the same path but different parameterizations. If the maps 
are called h_1 and h_2, then the map h_2^{-1} ∘ h_1 gives a relation between 
the two parameters, λ_2 = λ_2(λ_1). If dλ_2/dλ_1 = 1 at P the two tangent 
vectors will be the same at P. 

As an example of 
this, consider the simple curve x^i(λ) = λ a^i, where the numbers {a^i} are constants. 
Then if P is the point λ = 0, the tangent there is dx^i/dλ = a^i. Another curve, 
x^i(μ) = μ^2 b^i + μ a^i, also passes through P at μ = 0 and has the same tangent 
vector there, dx^i/dμ = a^i. A re-parameterization of the first curve, x^i = (μ^3 
+ μ) a^i, passes through all the same points and at P (μ = 0) has the same tangent, 
dx^i/dμ = a^i. So each vector really characterizes a whole equivalence class of 
curves at that point. 
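
The equivalence class can be illustrated numerically. The sketch below takes 
three curves through the origin of R^2 of the kind just described, namely 
λ a^i, μ^2 b^i + μ a^i, and (μ^3 + μ) a^i; the numerical values of a^i and 
b^i and the finite-difference helper are assumptions made for this example. 

```python
A = (1.0, 2.0)    # the constants a^i (arbitrary illustrative values)
B = (-3.0, 0.5)   # the constants b^i (arbitrary illustrative values)


def curve1(lam):
    """Straight line x^i = lam * a^i."""
    return tuple(lam * a for a in A)


def curve2(mu):
    """A different path, x^i = mu^2 * b^i + mu * a^i, tangent to curve1 at 0."""
    return tuple(mu * mu * b + mu * a for a, b in zip(A, B))


def curve3(mu):
    """Same path as curve1, re-parameterized: x^i = (mu^3 + mu) * a^i."""
    return tuple((mu ** 3 + mu) * a for a in A)


def tangent_at_zero(curve, h=1e-5):
    """Central-difference estimate of dx^i/d(parameter) at parameter 0."""
    plus, minus = curve(h), curve(-h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))
```

All three estimates returned by `tangent_at_zero` agree with the components 
a^i up to the accuracy of the finite difference, even though the curves differ 
away from P. 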

This use of the term ‘vector’ relies on familiar concepts from Euclidean 
space, where vectors are defined by analogy with displacements Δx^i. However, 
since manifolds need have no distance relation between points, we shall need a 
definition of vector which relies only on infinitesimal neighborhoods of points 
of M. Suppose a and b are two numbers, and x^i = x^i(μ) is another curve through 
P. Then at P we have 

$$ a \frac{d}{d\lambda} + b \frac{d}{d\mu} = \sum_i \left( a \frac{dx^i}{d\lambda} + b \frac{dx^i}{d\mu} \right) \frac{\partial}{\partial x^i}. $$

Now, the numbers {a dx^i/dλ + b dx^i/dμ} are components of a new vector, which 
is certainly the tangent to some curve through P. So there must exist a curve 
with parameter, say, φ such that at P 

$$ \frac{d}{d\phi} = \sum_i \left( a \frac{dx^i}{d\lambda} + b \frac{dx^i}{d\mu} \right) \frac{\partial}{\partial x^i}. $$

Collecting these results, we get, at P, 

$$ a \frac{d}{d\lambda} + b \frac{d}{d\mu} = \frac{d}{d\phi}. $$

Therefore, the directional derivatives along curves, like d/dλ, form a vector space 
at P.† There are in any coordinate system special curves, the coordinate lines 
themselves. The derivatives along them are clearly ∂/∂x^i, and equation (2.3) 
shows that any d/dλ can be written as a linear combination of the particular 
derivatives ∂/∂x^i. It follows that {∂/∂x^i} are a basis for this vector space. Then 
(2.3) shows that d/dλ has components {dx^i/dλ} on this basis. We therefore have 
the remarkable result that the space of all tangent vectors at P and the space of 
all derivatives along curves at P are in 1-1 correspondence. For this reason the 
mathematician says that d/dλ is the tangent vector to the curve x^i(λ). We shall 
adopt this point of view, since it has three advantages. First, it is precise, since it 
does not involve displacements over finite separations. Second, it makes no 
mention of coordinates; in particular, it does not rely on notions like ‘transforms 
the same way as ...’. Third, a derivative is a kind of ‘motion’ along the curve, 
which is what, conceptually, a tangent vector generates; this association of a 
concept from analysis (the derivative) with one from geometry (the vector) 
has very powerful consequences. 

† The derivatives must, of course, obey the other axioms of §1.5 if they are to 
form a vector space, but closure under linear combinations is the only nontrivial 
one. 

One can still maintain the same ‘picture’ of a vector as an arrow tangent to 
the curve, since the components are just the same. Now, however, one must 
realize that only vectors at the same point P can be added together. Vectors at 
two different points have no relation with one another. The vectors lie, not in M, 
but in the tangent space to M at P, which is called T_P. For ordinary manifolds, 
like the surface of a sphere, this tangent space is easy enough to visualize as a 
plane tangent to the sphere at that point. For more abstract manifolds it may be 
harder. 

We shall use the term vector to refer to a vector at a given point P of M. The 
term vector field refers to a rule for defining a vector at each point of M. 


2.8 Basis vectors and basis vector fields 
At any point P, the space T_P is a vector space with the same dimension 
n as the manifold. Any collection of n linearly independent vectors in T_P is a 
basis for T_P. By choosing a basis in each T_P for all points P of M, we arrive at a 
basis for vector fields. If we have a coordinate system {x^i} in a neighborhood U 
of P, then the coordinates define the coordinate basis {∂/∂x^i} at all points in U. 

But one need not use the coordinate basis; one could refer vectors to some 
arbitrary basis {ē_j}. Here the subscript j is used as a label to distinguish one basis 
vector from another. It does not denote the component of anything. At a point 
P, an arbitrary vector V can be written as 

$$ V = \sum_i V^i \frac{\partial}{\partial x^i} = \sum_j V^{\prime j} \bar{e}_j. $$

The numbers {V^i} are the components of V on {∂/∂x^i}. The numbers {V′^j} are 
the components of V on {ē_j}, and are related to V^i by the usual vector 
transformation laws, which we will deal with later. If V and the bases {∂/∂x^i} and {ē_j} 
are regarded as vector fields, then the components {V^i} and {V′^j} of the field V 
are functions on M. A vector field is said to be differentiable if these functions 
are differentiable. 

We have implicitly assumed above that the vectors {∂/∂x^i} of an arbitrary 
coordinate system are in fact all linearly independent at any point P of U. What 
justification do we have for this? We shall show that this is just the condition 
for the coordinates to be good coordinates at P, i.e. for them to provide a 1-1 
map of some neighborhood U of P onto a region V of R^n. Consider a set of 
coordinates on U which are good, say {y^i, i = 1, ..., n}. Then the map from 
(x^1, ..., x^n) to U can be expressed by the equations 

$$ y^i = y^i(x^1, \dots, x^n), \quad i = 1, \dots, n. $$

By the inverse function theorem (§1.2) this map is 1-1 (has an inverse) in U if 
and only if the Jacobian matrix ∂y^j/∂x^i has a nonvanishing determinant. This 
means that at any point of U the vectors whose components are (∂y^1/∂x^1, 
∂y^2/∂x^1, ..., ∂y^n/∂x^1), (∂y^1/∂x^2, ∂y^2/∂x^2, ..., ∂y^n/∂x^2), ..., (∂y^1/∂x^n, 
∂y^2/∂x^n, ..., ∂y^n/∂x^n) are linearly independent. But these are just the 
components of the vectors {∂/∂x^i, i = 1, ..., n} on the coordinate basis of the {y^i} 
system, because by the chain rule 

$$ \frac{\partial}{\partial x^1} = \frac{\partial y^1}{\partial x^1} \frac{\partial}{\partial y^1} + \frac{\partial y^2}{\partial x^1} \frac{\partial}{\partial y^2} + \dots + \frac{\partial y^n}{\partial x^1} \frac{\partial}{\partial y^n}, $$

and similarly for the other x^i. So {x^i} is in fact a good coordinate system in U 
if and only if {∂/∂x^i} are a basis for vectors at each point of U. The reader may 
wish to look at the basis vectors of the spherical coordinates on the sphere to see 
how they go bad at the poles. 
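
One way to carry out this check is sketched below. It assumes, as the ‘good’ 
coordinates {y^i} near the north pole, the stereographic-type coordinates 
y^1 = tan(θ/2) cos φ, y^2 = tan(θ/2) sin φ; this choice, and the function 
names, are assumptions made here rather than taken from the text. The columns 
of the Jacobian matrix are the components of ∂/∂θ and ∂/∂φ on the basis 
{∂/∂y^i}, and its determinant vanishes at the pole θ = 0. 

```python
import math


def jacobian(theta, phi):
    """Matrix of partial derivatives of (y1, y2) with respect to (theta, phi).

    Rows are y1, y2; columns are theta, phi, for
    y1 = tan(theta/2) cos(phi), y2 = tan(theta/2) sin(phi).
    """
    t = math.tan(theta / 2.0)
    dt = 0.5 / math.cos(theta / 2.0) ** 2  # d/dtheta of tan(theta/2)
    return (
        (dt * math.cos(phi), -t * math.sin(phi)),
        (dt * math.sin(phi), t * math.cos(phi)),
    )


def det(J):
    """Determinant of a 2x2 matrix."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]
```

On the equator the determinant is nonzero, so {∂/∂θ, ∂/∂φ} is a basis there; 
at θ = 0 the second column vanishes, the two vectors cease to be linearly 
independent, and (θ, φ) is not a good coordinate system at the pole. 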


2.9 Fiber bundles 

A particularly interesting manifold is formed by combining a manifold 
M with all its tangent spaces T_P. This is illustrated in figure 2.11 for the 
simplest case: a one-dimensional manifold M (a curve) and its tangent spaces (lines 
tangent to it at each point). In part (a) of figure 2.11 we draw the curve and a 
few tangent spaces; these are lines drawn tangent to the curve, and each must be 
thought of as extending infinitely far in both directions in order to allow for 
vectors of arbitrary length at each point. 


Fig. 2.11. (a) A one-dimensional manifold and some of its tangent 
spaces. (b) The same, with the tangent spaces drawn parallel to one 
another to avoid spurious intersections. 

Now, when drawn this way the picture 
can be messy, since the various tangent spaces intersect one another and the 
curve M haphazardly: these spurious intersections have no meaning. A better 
way of drawing the picture is as in figure 2.11(b), where the tangent spaces are 
drawn parallel: they do not intersect each other, and they cross M only at the 
point where they are defined. This picture unfortunately does not show the fact 
that each T_P is ‘tangent’ to the curve, but that is the price to be paid for clarity. 
Each point on the vertical line T_P represents a vector, having that ‘length’ and 
being tangent to M at P. Figure 2.11(b) also shows something else: every point 
in the figure (a two-dimensional manifold) is a point of one and only one 
tangent space for M, say T_R for point R of M. To each point in that figure there is 
one and only one vector at one and only one point of M. So one is led to define 
a new manifold TM, consisting of all vectors at all points, which is thus 
two-dimensional. It is called a fiber bundle, and the fibers are the spaces T_P for each 
P. The term ‘fiber’ comes from drawing pictures like figure 2.11(b) above. To 
see that TM is indeed a two-dimensional manifold, let us construct a coordinate 
system for a portion of it. Let the one-dimensional manifold M have coordinate 
x, and let us find coordinates for the tangent spaces to points of M in the region 
a < x < b for some a and b, assuming that the coordinate x itself is a good 
coordinate in this interval. (The reason for this assumption will be evident in §2.11 
below.) Any tangent vector V at any point P can be written as 

$$ V = y \, \frac{\partial}{\partial x}, \qquad (2.4) $$

so that the component y is a coordinate for T_P (cf. equation (2.1)). It clearly is 
a good coordinate over the whole fiber T_P. Since each fiber has a fixed value of 
x, the coordinates (x, y) locate a particular vector (y) tangent at a particular 
point (x). Since every point of the fiber bundle must by definition lie in a region 
of this sort, we have proved that this TM is a manifold. Clearly, the construction 
is easily generalized to tangent fiber bundles of higher-dimensional manifolds. A 
coordinate system of this sort, in which coordinates for T_P are determined by 
those on M at P by expressing a vector on the coordinate basis (2.4), is called a 
natural coordinate system for TM. 

Now, the curve in the fiber bundle drawn as a dashed line in figure 2.12 
identifies a particular vector at each point of M, and so the curve defines a vector 
field on M. Such a curve (i.e. one which is nowhere parallel to a fiber) is called a 
cross-section of TM. Clearly, it is not usually meaningful to ask for the ‘length’ 
of the curve, and so here we have an example of a manifold on which one 
usually would not bother to define a metric. 

A general fiber bundle consists of a base manifold, which in our case is the 
curve M, and one fiber attached to each point of the base space. If the base space 
is n-dimensional and each fiber is m-dimensional, then the bundle has m + n 
dimensions. It is a special kind of manifold, since it has the property of being 
decomposable into fibers: the points of a single fiber are related to one another 
while points on different fibers are not. This is formalized by defining a 
projection map π, which maps any point of a fiber to the point of the base manifold 
the fiber is attached to. A general manifold does not have such a projection 
defined on it. The following examples illustrate the wide variety of spaces 
describable as fiber bundles. 


2.10 Examples of fiber bundles 

(i) The fiber bundle TM we have illustrated consists of a manifold and 
its tangent spaces, and is called the tangent bundle. It is one of the most impor- 
tant abstract manifolds in physics. For an n-dimensional manifold, TM has 2n 
dimensions. 

(ii) Later in this chapter we will generalize from vector fields to tensor fields. 
There are corresponding bundles over any differentiable manifold for every type 
of tensor. 

(iii) The fibers need not be related to the differential structure of the base 
space. Consider the ‘internal’ variables describing the state of an elementary 
particle, such as isospin. A bundle whose fibers are isospin space and whose base 
space is spacetime is capable of describing both the position variables (x, y, z, t) 
of the particle and its internal (isospin) state. 

(iv) The view of spacetime taken by Newtonian physics has a natural 
fiber-bundle structure. To Newton and Galileo, time was absolute: everyone can agree 
what events are simultaneous, no matter where they occur. We can therefore 
construct a bundle whose base space is R^1 (time) and whose fibers are R^3 
(space). This is illustrated in figure 2.13. There is no natural relation between 
points on different fibers (points of space at different times), because Newtonian 
physics has no ‘absolute space’: two different observers moving with respect to 
each other disagree as to what constitutes a fixed point of space. So there is no 
natural fiber structure with R^3 as a base, while there is with R^1. One effect of 
Einstein’s relativity was to destroy this bundle structure and to substitute 
something else, a metric structure (see §2.31 below). 


Fig. 2.12. A cross-section (dashed line) of the fiber bundle TM of a 
one-dimensional manifold M (heavy line). 


2.11 A deeper look at fiber bundles 

There are two related aspects of fiber bundles which we should con- 
sider in order to appreciate the richness and usefulness of the bundle concept. 
These are their global properties and the importance of groups in their con- 
struction. 

To understand the interesting global properties fiber bundles can have, we 
must first define a simpler concept, the product space. Two spaces M and N have 
an associated (Cartesian) product space M × N consisting of all ordered pairs (a, 
b) with a in M and b in N. For example, R^2 is defined as the product R^1 × R^1. If 
M and N are manifolds, M × N is also a manifold in an obvious way: the set of 
coordinates {x^i, i = 1, ..., m} of an open set U of M, taken together with 
{y^i, i = 1, ..., n} of an open set V of N, form a set of m + n coordinates for the 
open set (U, V) of M × N. It is clear from our construction of fiber bundles 
above that they are, at least locally, product spaces, the product U × F of an 
open set U of the base manifold B with the space F representing a typical fiber 
(all fibers being identical to F). This in fact forms part of the definition of a 
fiber bundle: it is locally trivial (it is a product space when we look at a local 
region of B). The interesting question is whether it is globally trivial: whether 
the whole fiber bundle can be represented as the product B × F. 

The answer is usually no, and we give two examples which illustrate what 
both the question and the answer mean. 

(i) Consider TS^2, the tangent bundle of the two-sphere S^2. If it were globally 
trivial, there would be a C^∞ 1-1 map (a diffeomorphism) of TS^2 onto S^2 × R^2, 
since the typical fiber is R^2, the tangent plane. Consider the set of points in S^2 
× R^2 of the form (P, V), where P is an arbitrary point of S^2 and V is a given 
fixed nonzero vector in R^2. Then the inverse of the above map gives a nowhere-zero 
cross-section of TS^2, i.e. a definition of a C^∞ vector field on S^2 which is 
nowhere zero. But in fact there is no C^∞ vector field on S^2 which is nowhere 
zero. This is a consequence of the famous but difficult fixed-point theorem of 
the sphere, that every 1-1 map (diffeomorphism) of S^2 onto itself which can be 
continuously deformed into the identity map leaves at least 
one point of S^2 fixed. A nowhere-zero vector field would generate such a map 
with no fixed point, as we explain in §3.1 below. Therefore TS^2 does not have 
a global product structure. This is an example in which the bundle is nontrivial 
because of the topology of the base manifold, S^2. 


Fig. 2.13. The natural bundle structure of Newtonian (Galilean) 
spacetime, which is ‘sliced up’ into moments of constant universal time. 

(ii) The second example shows that one can actually make a bundle nontrivial 
even if the base space allows a trivial bundle. Consider TS^1, the tangent bundle 
of the circle S^1. Unlike S^2, the circle does allow a continuous nowhere-vanishing 
vector field, and TS^1 is identical to the product space S^1 × R, as shown in figure 
2.14. This is just the global version of the local picture shown in figure 2.11(b). 
But suppose we ‘cut’ the circle at P in figure 2.14 and unwrap the bundle, laying 
it flat, as in figure 2.15. To reconstruct figure 2.14 from 2.15 we simply identify 
point a with a′, P with P′, b with b′, and so on. But we can reassemble the fiber 
bundle a different way by forming a Möbius band: identify a with b′, P with P′, 
b with a′, and so on. This gives the strip a twist so that it looks like figure 2.16 
when joined together. Locally it is still the same as figure 2.11(b); in fact the 
bundle over any connected open proper subset of S^1 (‘proper’ means not 
identical to S^1) has a 1-1 continuous map onto the same portion of figure 2.14. One 
has to go all the way around to see that there is no continuous 1-1 map of all of 
one bundle onto all of the other. Therefore, the Möbius band is not a product 
space, and the second bundle is nontrivial. Nontrivial constructions of bundles 
in analogous ways are used in modern particle physics to define the so-called 
‘instantons’. 


Fig. 2.14. The trivial way of constructing TS^1 as the product space of 
the circle S^1 and the typical fiber R^1 (drawn vertically). Cf. figure 
2.11(b). 


Fig. 2.15. TS^1 cut along one fiber and laid flat. The fibers extend 
infinitely far in the vertical direction. 

The Möbius example has a lesson for us: it is not sufficient simply to say 
what the base and fiber of a bundle are, because there may be more than one 
way to construct such a bundle. We need a better definition of a fiber bundle, 
and this is where groups come in. The difference between the two bundles over 
S^1 is in what is called the bundles’ structure group. To phrase the full definition 
of a fiber bundle more compactly, we need to define a homeomorphism, which 
is simply a 1-1 map from one space onto another, which is continuous and 
whose inverse is continuous.† (For an explanation of the terminology of maps, 
see §1.2.) We define a fiber bundle as a space E for which the following are 
given: a base manifold B, a projection π: E → B, a typical fiber F, a structure 
group G of homeomorphisms of F onto itself, and a family {U_j} of open sets 
covering B (i.e. open sets whose union is B), all of which satisfy the following 
restrictions. 

(i) Locally the bundle is trivial, which means that the bundle over any set U_j, 
which is just π^{-1}(U_j), has a homeomorphism onto the product space U_j × F. We 
have noted this above. Part of this homeomorphism is a homeomorphism from 
each fiber, say π^{-1}(x) where x is an element of B, onto F. Let us call this map 
h_j(x), labelled not only by the point x which defines the fiber but also by the 
index j which denotes the set U_j containing x. 

(ii) When two sets U_j and U_k overlap, a given point x in their intersection 
has two homeomorphisms h_j(x) and h_k(x) from its fiber onto F. Since a 
homeomorphism is invertible, the map h_j(x) ∘ h_k^{-1}(x) is a homeomorphism of F 
onto F. This is required to be an element of the structure group G. 


Fig. 2.16. The Möbius-band version of this bundle: the fibers turn over 
once as one follows them around the circle. Locally it still has the same 
structure as figure 2.11(b). 


† A homeomorphism is a diffeomorphism without the differentiability requirement. 
For most of the bundles of physical interest, one can read ‘diffeomorphisms’ for 
‘homeomorphisms’. 

The second restriction contains the information about the global structure of 
the fiber bundle. To see how this works, we first give the complete definition of 
TS^1 (which has a straightforward generalization to TM for any M). The bundle 
E = TS^1 has base B = S^1, typical fiber F = R^1, and projection π: (x, v̄) ↦ x, 
where x is a point of S^1 and v̄ is a vector in T_x. Let the covering {U_j} be the 
open sets of any atlas of S^1. A typical family {U_j} is illustrated in figure 2.17. 
Every U_j has a coordinate ‘system’, i.e. a parameterization of S^1, which we will 
call λ_j. The vector d/dλ_j at x in U_j is a basis for T_x, so any vector v̄ in T_x has 
the representation a_(j) d/dλ_j for any fixed j, where a_(j) is a real number. This is just 
equation (2.4) again. The homeomorphisms of T_x onto R which are part of the 
definition of TS^1 are defined to be h_j(x): v̄ ↦ a_(j). If x is in two neighborhoods 
U_j and U_k there are two such homeomorphisms from T_x onto R, and since λ_j 
and λ_k are unrelated, a_(j) and a_(k) can be any two nonzero real numbers. The 
homeomorphism h_j(x) ∘ h_k^{-1}(x): F → F maps a_(k) ↦ a_(j) and is therefore just 
multiplication by the number r_jk = a_(j)/a_(k). Since r_jk is any real number other 
than zero, the structure group is R^1 - {0}, which is a group under 
multiplication, a Lie group in fact. We note in passing that for an n-dimensional 
manifold M, the structure group of TM is the set of all n × n matrices with nonzero 
determinant, which is called GL(n, R). We will study this group in chapter 3. 
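
A short worked step, added here as an illustration and using only the chain 
rule, shows where the overlap numbers come from. Writing the same vector on 
the two bases d/dλ_j and d/dλ_k at a point x of the overlap gives 

```latex
\bar{v} \;=\; a_{(k)} \frac{d}{d\lambda_k}
        \;=\; a_{(k)} \frac{d\lambda_j}{d\lambda_k} \frac{d}{d\lambda_j}
        \;=\; a_{(j)} \frac{d}{d\lambda_j}
\quad\Longrightarrow\quad
r_{jk} \;=\; \frac{a_{(j)}}{a_{(k)}}
       \;=\; \left. \frac{d\lambda_j}{d\lambda_k} \right|_{x} .
```

Read this way, r_jk is nonzero precisely because λ_j and λ_k are both good 
coordinates near x, and it depends only on how the two parameterizations are 
related there. 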

This defines TS^1. But what does it look like? It is possible to choose the 
coordinates λ_j in such a way that any two, say λ_j and λ_k, increase in the same 
direction in S^1 in the region where U_j and U_k overlap. (We say that S^1 is 
orientable; see §4.7.) With such a choice of coordinates it is not hard to see that all 
the ‘overlap numbers’ r_jk are positive, and the structure group reduces to R^+, 
multiplication by the positive real numbers. In fact we can do even better by 
scaling the coordinates in such a way that dλ_j/dλ_k = 1 in every overlap region. 
Then the group reduces to 1, the identity element. The structure group is trivial, 
and so is the bundle structure. This is the bundle represented in figure 2.14. 

To characterize the structure of the Möbius band we must use different maps 
h_j(x), and we must be careful not to try to interpret the bundle as a tangent 
bundle. The easiest procedure is to use the family {U_j, j = 1, ..., 8} shown in 
figure 2.17 and to define r_12 = 1, r_23 = 1, ..., r_78 = 1. But then the twist in 
the Möbius band forces us to use r_81 = -1. The structure group consists of the 
elements {1, -1} with multiplication as the group operation. We could have 
made other choices for the r_jk’s, but we could not have found a smaller structure 
group. 


Fig. 2.17. A set of neighborhoods of S^1 which cover S^1. The extent 
of each neighborhood is indicated by the parentheses. U_1 overlaps U_2, 
U_2 overlaps U_3, and so on until U_8 overlaps U_1. 
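
The role of the overlap numbers can be illustrated with a short sketch. It 
treats the eight numbers r_12, r_23, ..., r_81 as a list and looks at their 
product around the circle; the function names and the idea of rescaling the 
fiber coordinate in one patch by a constant are assumptions introduced for 
this example. Rescaling the fiber coordinate in a patch changes two 
neighboring overlap numbers but leaves the product unchanged, which is one 
way to see that the -1 of the Möbius band cannot be removed by a different 
choice. 

```python
def product_around_circle(overlaps):
    """Product of the overlap numbers r_12, r_23, ..., r_81."""
    result = 1
    for r in overlaps:
        result *= r
    return result


def rescale_patch(overlaps, j, c):
    """Overlap numbers after scaling the fiber coordinate in patch j by c.

    Patches are numbered 0..n-1 here; overlaps[j] relates patch j to patch
    j+1 (cyclically), with patch j as the first index, so it is multiplied
    by c, while overlaps[j-1] has patch j as second index and is divided by c.
    """
    if c == 0:
        raise ValueError("the scale factor must be nonzero")
    n = len(overlaps)
    new = list(overlaps)
    new[j] = c * new[j]
    new[(j - 1) % n] = new[(j - 1) % n] / c
    return new


cylinder = [1] * 8        # the trivial bundle of figure 2.14
mobius = [1] * 7 + [-1]   # the Moebius band: r_81 = -1
```

For `cylinder` the product is +1 and for `mobius` it is -1, and no call to 
`rescale_patch` changes either value. 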

The tangent bundle TS^1 had structure group R^1 - {0}, which is nearly the 
same as its typical fiber. The frame bundle of any manifold M has the same 
structure group as TM, but its fiber is the set of all bases for the tangent space 
(equivalently, for R^n). In the case of a one-dimensional manifold like S^1, this is 
the set of all nonzero vectors, which is identical to R^1 - {0}. So the frame 
bundle of S^1 has fibers homeomorphic to its structure group, and this is true of 
all frame bundles. Such a bundle is called a principal fiber bundle. 


2.12 Vector fields and integral curves 

As defined in §2.7, a vector field is a rule that gives a vector at every
point of M. Each point has its own tangent vector space, so a vector field selects
one vector from each space. Now, every curve has a tangent vector at every
point, and the question arises of whether the converse is true: given an arbitrary
vector field, is it possible to start at one point P and find a curve whose tangent
vector is always the vector field at whatever point the curve passes through? The
answer is yes, for $C^1$ vector fields, and such curves are called integral curves
of the vector field. The proof is as follows. Let the components of the vector
field be $V^i(P)$, functions of P. In some coordinate system $\{x^i\}$ we have
$V^i(P) = v^i(x^j)$. The statement that this is a tangent vector to a curve with
parameter $\lambda$ is

\[ \frac{dx^i}{d\lambda} = v^i(x^j). \tag{2.5} \]

This is just a set of first-order ordinary differential equations for
$x^i(\lambda)$, and a unique solution always exists in some neighborhood of the
initial point P. (This existence/uniqueness theorem for ordinary differential
equations is proved in most textbooks on differential equations. A version may be
found in Choquet-Bruhat, DeWitt-Morette & Dillard-Bleick (1977) in the
bibliography.) Two particular vector fields are illustrated in figure 2.18.
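
Equation (2.5) can also be solved numerically. The following is a minimal
sketch, assuming the rotation field $x\,\partial/\partial y - y\,\partial/\partial x$
of figure 2.18(a) and a classical fourth-order Runge-Kutta step; the function
names, step count and starting point are illustrative choices, not part of the
text. The integral curve through (1, 0) should be the unit circle, so after a
parameter distance $\pi/2$ the curve should arrive close to (0, 1).

```python
import math

def rk4_step(field, point, h):
    """Advance one step of size h along the integral curve of `field` (classical RK4)."""
    def shifted(p, k, s):
        return tuple(pi + s * ki for pi, ki in zip(p, k))
    k1 = field(point)
    k2 = field(shifted(point, k1, h / 2))
    k3 = field(shifted(point, k2, h / 2))
    k4 = field(shifted(point, k3, h))
    return tuple(p + h / 6 * (a + 2 * b + 2 * c + d)
                 for p, a, b, c, d in zip(point, k1, k2, k3, k4))

def integral_curve(field, start, lam, steps=1000):
    """Approximate the point at parameter value lam on the curve through `start`."""
    h = lam / steps
    point = tuple(start)
    for _ in range(steps):
        point = rk4_step(field, point, h)
    return point

def rotation_field(p):
    """Components (V^x, V^y) of V = x d/dy - y d/dx, i.e. dx/dlam = -y, dy/dlam = x."""
    x, y = p
    return (-y, x)

# Start at (1, 0) and move a parameter distance pi/2 along the curve.
end = integral_curve(rotation_field, (1.0, 0.0), math.pi / 2)
print(end)
```

Because the solution through a given point is unique, starting the same
integration from any other point of the circle traces out the same path.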

Notice that the paths of different integral curves can never cross except
possibly at a point where $V^i = 0$ for all i, because of the uniqueness of
solutions to (2.5). Since some integral curve passes through each point P (it is
found by solving (2.5) with initial conditions at P), the integral curves ‘fill’
M. For instance, if M is three-dimensional, then there is a two-dimensional family
of integral curves for each vector field on M, and they cover all of M (except
possibly isolated points where $V^i = 0$ for all i). Such a manifold-filling set
of curves is called a congruence. The set of curves, incidentally, can usually be
regarded as a manifold itself.


2.13 Exponentiation of the operator d/dA 

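We now introduce an idea that will prove to be a useful tool in several
subsequent calculations. Suppose we have an analytic manifold ($C^\omega$), and
the coordinate values $x^i(\lambda)$ of points along the integral curves of
$\bar Y = d/d\lambda$ are analytic functions of $\lambda$. Then the coordinates of
two points with parameters $\lambda_0$ and $\lambda_0 + \epsilon$ are related by
the Taylor series

\[
\begin{aligned}
x^i(\lambda_0 + \epsilon)
&= x^i(\lambda_0) + \epsilon \left(\frac{dx^i}{d\lambda}\right)_{\lambda_0}
 + \frac{1}{2!}\,\epsilon^2 \left(\frac{d^2 x^i}{d\lambda^2}\right)_{\lambda_0} + \ldots \\
&= \left(1 + \epsilon \frac{d}{d\lambda}
 + \frac{1}{2!}\,\epsilon^2 \frac{d^2}{d\lambda^2} + \ldots\right) x^i \,\Big|_{\lambda_0} \\
&= \exp\left(\epsilon \frac{d}{d\lambda}\right) x^i \,\Big|_{\lambda_0},
\end{aligned}
\tag{2.6}
\]

where the ‘exp’ notation is an obvious and convenient shorthand for the
differential operator which, when applied to $x^i(\lambda)$ and evaluated at
$\lambda_0$, gives the Taylor series. It is called the exponentiation of the
operator $\epsilon\, d/d\lambda$. Since $\epsilon\, d/d\lambda$ is an infinitesimal
‘motion’ along the integral curve, its exponentiation gives a finite motion. Other
notations we will use include

\[ \exp(\epsilon\, d/d\lambda) = e^{\epsilon\, d/d\lambda} = e^{\epsilon \bar Y}. \]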
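
The content of (2.6) is easy to check on a single analytic function. The
following is a minimal sketch, assuming the illustrative choice
$x(\lambda) = \sin\lambda$ (whose n-th derivative is $\sin(\lambda + n\pi/2)$) and
a series truncated after 25 terms; neither choice comes from the text. Summing
the series for $\exp(\epsilon\, d/d\lambda)$ at $\lambda_0$ should reproduce
$x(\lambda_0 + \epsilon)$.

```python
import math

def exp_shift(derivative_at, lam0, eps, terms=25):
    """Sum the first `terms` terms of exp(eps d/dlam) x, evaluated at lam0.

    derivative_at(n, lam) must return the n-th derivative of x at lam.
    """
    return sum(eps**n / math.factorial(n) * derivative_at(n, lam0)
               for n in range(terms))

def sin_derivative(n, lam):
    """n-th derivative of sin(lam): each derivative shifts the phase by pi/2."""
    return math.sin(lam + n * math.pi / 2)

lam0, eps = 0.3, 0.8
shifted = exp_shift(sin_derivative, lam0, eps)   # series at lam0
exact = math.sin(lam0 + eps)                     # function at lam0 + eps
print(shifted, exact)
```

The parameter distance here is not small: for an analytic function the
exponentiated operator produces a finite motion, as stated above.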


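Fig. 2.18. Integral curves of two vector fields on $R^2$.
(a) $\bar V = x\,\partial/\partial y - y\,\partial/\partial x$;
(b) $\bar V = (x + y/r)\,\partial/\partial y - (y - x/r)\,\partial/\partial x$,
with $r = (x^2 + y^2)^{1/2}$.


2.14 Lie brackets and noncoordinate bases

Given a coordinate system $x^i$, it is often convenient to adopt
$\{\partial/\partial x^i\}$ as a basis for vector fields. However, any linearly
independent set of vector fields can serve as a basis, and one can easily show
that not all of them are derivable from coordinate systems. This is because the
operators $\partial/\partial x^i$ and $\partial/\partial x^j$ commute for all i, j.
Two arbitrary vector fields do not commute: if $\bar V = d/d\lambda$ and
$\bar W = d/d\mu$ then

\[
\begin{aligned}
\frac{d}{d\lambda}\frac{d}{d\mu} - \frac{d}{d\mu}\frac{d}{d\lambda}
&= \sum_{i,j} \left( V^i \frac{\partial}{\partial x^i}\, W^j \frac{\partial}{\partial x^j}
 - W^j \frac{\partial}{\partial x^j}\, V^i \frac{\partial}{\partial x^i} \right) \\
&= \sum_{i,j} V^i \frac{\partial W^j}{\partial x^i} \frac{\partial}{\partial x^j}
 - \sum_{i,j} W^j \frac{\partial V^i}{\partial x^j} \frac{\partial}{\partial x^i} \\
&= \sum_{i,j} \left( V^i \frac{\partial W^j}{\partial x^i}
 - W^i \frac{\partial V^j}{\partial x^i} \right) \frac{\partial}{\partial x^j},
\end{aligned}
\tag{2.7}
\]
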


where the last line follows from relabelling the summation indices in the final
sum of the middle quantity. Therefore, the commutator

\[ \left[\frac{d}{d\lambda}, \frac{d}{d\mu}\right]
 \equiv \frac{d}{d\lambda}\frac{d}{d\mu} - \frac{d}{d\mu}\frac{d}{d\lambda} \tag{2.8} \]

is a vector field whose components do not vanish in general. If $d/d\lambda$ and
$d/d\mu$ are two elements of a basis and their commutator does not vanish, then
they will not be expressible as derivatives with respect to any coordinates. Such
a basis is a noncoordinate basis.

It is important to realize that this distinction between coordinate and non- 
coordinate bases is one which can be made only over some region of the mani- 
fold, not at a single point. It depends on the derivatives of the components of 
the vectors, not just on their values at a point. The different properties of 
coordinate and noncoordinate bases therefore matter only over regions of a 
manifold, and are irrelevant in problems which involve only the tangent space 
Tp of a single point P. 
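
The component formula in (2.7) is straightforward to evaluate numerically. The
following is a minimal sketch, assuming vector fields on $R^n$ given as Python
functions from a point to a tuple of components, with the partial derivatives
estimated by central differences; the function names, the step size `h` and the
sample fields are illustrative assumptions. It shows that the coordinate fields
$\partial/\partial x$ and $\partial/\partial y$ commute, while the rotation field
$x\,\partial/\partial y - y\,\partial/\partial x$ does not commute with
$\partial/\partial x$.

```python
def lie_bracket(V, W, p, h=1e-5):
    """Components of [V, W] at the point p, from the last line of (2.7).

    V and W map a point (tuple of coordinates) to a tuple of components.
    Partial derivatives are estimated by central differences of step h.
    """
    n = len(p)

    def partial(F, i):
        # Returns the list of dF^j/dx^i at p, for all j.
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        return [(a - b) / (2 * h) for a, b in zip(F(tuple(plus)), F(tuple(minus)))]

    Vp, Wp = V(p), W(p)
    dV = [partial(V, i) for i in range(n)]   # dV[i][j] = dV^j/dx^i
    dW = [partial(W, i) for i in range(n)]   # dW[i][j] = dW^j/dx^i
    return tuple(sum(Vp[i] * dW[i][j] - Wp[i] * dV[i][j] for i in range(n))
                 for j in range(n))

def d_dx(p):
    return (1.0, 0.0)

def d_dy(p):
    return (0.0, 1.0)

def rotation(p):
    """x d/dy - y d/dx."""
    return (-p[1], p[0])

P = (0.7, -0.4)
coordinate_pair = lie_bracket(d_dx, d_dy, P)      # vanishes
mixed_pair = lie_bracket(rotation, d_dx, P)       # close to (0, -1), i.e. -d/dy
print(coordinate_pair, mixed_pair)
```

Note that the bracket needs the values of the fields at neighboring points (here
$p \pm h$), which is the numerical counterpart of the remark that the distinction
cannot be made at a single point.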


Exercise 2.1 
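Show that the ‘unit’ basis vector fields for polar coordinates in the
Euclidean plane, defined by

\[ \hat r = \cos\theta\,\hat x + \sin\theta\,\hat y, \qquad
   \hat\theta = -\sin\theta\,\hat x + \cos\theta\,\hat y, \]

where $\hat x = \partial/\partial x$ and $\hat y = \partial/\partial y$, are a
noncoordinate basis.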
The commutator $[d/d\lambda, d/d\mu]$ is called the Lie bracket† of $\bar V$ and
$\bar W$, and we now look at its geometrical interpretation. In figure 2.19 we
have drawn a coordinate grid on a two-dimensional manifold. Notice that by
definition $x^1$ is constant along the lines of $x^2$, which are the integral
curves of $\partial/\partial x^2$. That is why $\partial/\partial x^1$ and
$\partial/\partial x^2$ commute: each is a derivative along a line on which the
other coordinate is fixed. Now consider two arbitrary vector fields,
$\bar V = d/d\lambda$ and $\bar W = d/d\mu$, whose integral curves are shown in
figure 2.20. An integral curve of $\bar W$ is not necessarily a curve of constant
$\lambda$, and vice versa. The derivative $d/d\mu$ is not a derivative holding
$\lambda$ fixed, so $d/d\lambda$ and $d/d\mu$ do not commute. Although the
$\bar V$ and $\bar W$ curves look like coordinate curves, their parameterization
is not that of a coordinate system. Even the fact that they look like coordinate
curves is an artefact of two dimensions: in three dimensions it may happen that
curve (1) intersects curves ($\alpha$) and ($\beta$) but (2) intersects only
($\alpha$).


Fig. 2.19. Typical coordinate grid on a two-dimensional manifold. 





Fig. 2.20. Typical integral curves of two vector fields on a two- 
dimensional manifold. 





† The ‘Lie’ of Lie bracket is the same Lie as in Lie groups: Sophus Lie, the great
mathematician of the late nineteenth century. The Lie bracket is, as we shall see,
a special case of the Lie derivative. Readers who are familiar with Lie groups may
recognize the Lie bracket as the commutator of the vector fields $d/d\lambda$ and
$d/d\mu$ which generate a Lie group of mappings. We discuss these mappings in
chapter 3.
We can obtain a picture of the vector $[\bar V, \bar W]$ in the following manner.
In figure 2.21, consider starting at P, moving $\Delta\lambda = \epsilon$ along
the $\bar V$ curve through P, and then moving $\Delta\mu = \epsilon$ along a
$\bar W$ curve. One winds up at A. Starting again at P and going first
$\Delta\mu = \epsilon$ and then $\Delta\lambda = \epsilon$ takes one to
$B \neq A$. We shall show that the vector stretching from A to B is
$\epsilon^2 [\bar V, \bar W]$, to lowest order in $\epsilon$.

It is most convenient to use the exponentiation operator introduced earlier.
It is clear that, for the intermediate point R reached after the first step
along $\bar V$,

\[ x^i(R) = \exp\left(\epsilon \frac{d}{d\lambda}\right) x^i \,\Big|_P \]

and

\[ x^i(A) = \exp\left(\epsilon \frac{d}{d\mu}\right)
            \exp\left(\epsilon \frac{d}{d\lambda}\right) x^i \,\Big|_P. \tag{2.9} \]

Similarly, the path to point B from P gives us

\[ x^i(B) = \exp\left(\epsilon \frac{d}{d\lambda}\right)
            \exp\left(\epsilon \frac{d}{d\mu}\right) x^i \,\Big|_P. \tag{2.10} \]

Then the difference in the coordinates of A and B is

\[ x^i(B) - x^i(A) = \left[e^{\epsilon\, d/d\lambda}, e^{\epsilon\, d/d\mu}\right] x^i \,\Big|_P, \tag{2.11} \]


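just the commutator of the exponentiation operators. Returning to the Taylor
series, we can write

\[
\begin{aligned}
\left[e^{\epsilon\, d/d\lambda}, e^{\epsilon\, d/d\mu}\right]
&= \left[1 + \epsilon \frac{d}{d\lambda} + \tfrac{1}{2}\epsilon^2 \frac{d^2}{d\lambda^2} + \ldots,\;
         1 + \epsilon \frac{d}{d\mu} + \tfrac{1}{2}\epsilon^2 \frac{d^2}{d\mu^2} + \ldots\right] \\
&= \epsilon^2 \left[\frac{d}{d\lambda}, \frac{d}{d\mu}\right] + O(\epsilon^3),
\end{aligned}
\]

so that

\[ x^i(B) - x^i(A) = \epsilon^2 [\bar V, \bar W]\, x^i + O(\epsilon^3). \tag{2.12} \]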


Fig. 2.21. Geometric interpretation of the Lie bracket [V, W] as the 
open part of an incomplete parallelogram whose other sides are equal 
parameter increments along integral curves of V and W. 
This is just the ith component of the Lie bracket, and (2.12) justifies the picture
we have given for it.
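
The open parallelogram can be examined numerically. The following is a minimal
sketch, assuming two simple fields on $R^2$ whose flows are known in closed form,
the rotation $\bar V = x\,\partial/\partial y - y\,\partial/\partial x$ and the
translation $\bar W = \partial/\partial x$; these fields, the base point and all
function names are illustrative assumptions. In the sketch the flows are composed
as maps of points, and the gap is computed as (endpoint reached by moving along
$\bar V$ first) minus (endpoint reached by moving along $\bar W$ first).
Interchanging the two paths flips the sign of the gap, so the sign seen here is
tied to that convention. The point of the check is that the gap scales as
$\epsilon^2$ times the bracket computed from (2.7), with a remainder of order
$\epsilon^3$.

```python
import math

def flow_rotation(p, eps):
    """Exact flow of V = x d/dy - y d/dx: rotate p through the angle eps."""
    x, y = p
    return (x * math.cos(eps) - y * math.sin(eps),
            x * math.sin(eps) + y * math.cos(eps))

def flow_translation(p, eps):
    """Exact flow of W = d/dx: shift p by eps along x."""
    x, y = p
    return (x + eps, y)

def gap(p, eps):
    """Endpoint of (V first, then W) minus endpoint of (W first, then V)."""
    a = flow_translation(flow_rotation(p, eps), eps)
    b = flow_rotation(flow_translation(p, eps), eps)
    return (a[0] - b[0], a[1] - b[1])

# [V, W] from (2.7) for these two fields is -d/dy at every point.
BRACKET = (0.0, -1.0)

def residual(p, eps):
    """Largest component of gap - eps^2 [V, W]; expected to be O(eps^3)."""
    g = gap(p, eps)
    return max(abs(g[i] - eps**2 * BRACKET[i]) for i in range(2))

P = (0.7, -0.4)
for eps in (1e-1, 1e-2, 1e-3):
    # The ratio stays bounded as eps shrinks, so the residual is third order.
    print(eps, residual(P, eps) / eps**3)
```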


Exercise 2.2 

(a) Use (2.6) to prove (2.12). 

(b) Prove that 

\[ \exp[a\, d/d\lambda + b\, d/d\mu] = \exp[a\, d/d\lambda]\, \exp[b\, d/d\mu] \tag{2.13} \]
for all a and b if and only if $[d/d\lambda, d/d\mu] = 0$.


Exercise 2.3 
Prove that any three twice-differentiable (i.e. $C^2$) vector fields
$\bar X$, $\bar Y$ and $\bar Z$ satisfy the Jacobi identity


\[ [[\bar X, \bar Y], \bar Z] + [[\bar Y, \bar Z], \bar X] + [[\bar Z, \bar X], \bar Y] = 0. \tag{2.14} \]


A Lie algebra of vector fields on a region U of M is a set A of vector fields on
U which is a vector space under addition (which means any linear combination
with constant coefficients of fields in A is a field in A) and which is closed under
the Lie-bracket operation (the Lie bracket of any two fields in A is another field
in A). Clearly, the set of all $C^\infty$ vector fields on U is a Lie algebra, but
it is more interesting when a smaller set of vector fields singled out for some
reason also forms a Lie algebra. These are closely related to the invariance
properties of manifolds and to their associated invariance groups, which are
usually Lie groups. We shall study this in greater detail in chapter 3, where we
will also present a more general definition of a Lie algebra.
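
A small set of fields that closes under the bracket can be exhibited directly.
The following is a minimal sketch, assuming the three rotation fields on $R^3$
written as linear fields $V^i = A^i{}_j x^j$, for which (2.7) reduces to matrix
algebra: the bracket of the fields with matrices A and B is the linear field with
matrix $BA - AB$. The choice of fields, the matrix representation and the helper
names are illustrative assumptions, not taken from the text.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket_matrix(A, B):
    """Matrix of [V, W] for linear fields V^i = A^i_j x^j and W^i = B^i_j x^j.

    From (2.7): [V, W]^j = V^i dW^j/dx^i - W^i dV^j/dx^i = ((BA - AB) x)^j.
    """
    BA, AB = matmul(B, A), matmul(A, B)
    n = len(A)
    return [[BA[i][j] - AB[i][j] for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-a for a in row] for row in A]

# Rotation fields on R^3; the rows give the components V^x, V^y, V^z.
LX = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]    # y d/dz - z d/dy
LY = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]    # z d/dx - x d/dz
LZ = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]    # x d/dy - y d/dx

# Closure: each bracket is again a constant multiple of one of the three fields.
closes = (bracket_matrix(LX, LY) == neg(LZ)
          and bracket_matrix(LY, LZ) == neg(LX)
          and bracket_matrix(LZ, LX) == neg(LY))

# Jacobi identity (2.14) for the same three fields.
jacobi = add(add(bracket_matrix(bracket_matrix(LX, LY), LZ),
                 bracket_matrix(bracket_matrix(LY, LZ), LX)),
             bracket_matrix(bracket_matrix(LZ, LX), LY))
print(closes, jacobi)
```

Linear combinations with constant coefficients of these three fields therefore
form a three-dimensional Lie algebra inside the much larger algebra of all smooth
fields on $R^3$.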


2.15 When is a basis a coordinate basis?

Suppose we are given two vector fields $\bar A = d/d\lambda$ and
$\bar B = d/d\mu$ on a two-dimensional manifold M, and suppose that $\bar A$ and
$\bar B$ are linearly independent at every point of some open neighborhood U of
M, so that they form a basis for vector fields there. What condition would assure
us that they are a coordinate basis, in other words that $\lambda$ and $\mu$ are
coordinates for U? It is clearly necessary that they commute:

\[ [\bar A, \bar B] = 0. \]

We shall show that this condition is sufficient as well. To do this we go right
back to the basic definition of a manifold: we construct a 1-1 map from U onto
a neighborhood in $R^2$. Beginning at some point P in U, and using arbitrary
coordinates $(x^1, x^2)$ in U, we move a parameter distance $\lambda_1$ from P
along $\bar A$ to a point R whose coordinates are (by equation (2.6))

\[ x^i(R) = e^{\lambda_1 d/d\lambda}\, x^i \,\big|_P. \]

If we go first a distance $\lambda_1$ along $\bar A$, then $\mu_1$ along
$\bar B$, we get to a point Q with coordinates

\[ x^i(Q) = e^{\mu_1 d/d\mu}\, e^{\lambda_1 d/d\lambda}\, x^i \,\big|_P. \]


This equation defines an exponential-type map from some neighborhood V of
the origin of $R^2$ into U: a given element of V, the pair $(\lambda_1, \mu_1)$,
is mapped to the point Q. This map is illustrated in figure 2.22. In order for
this map to define a coordinate system, it must be 1-1: it must have an inverse.
We show below that it does have an inverse everywhere in U, but first we shall
show that $\bar A$ and $\bar B$ are the coordinate basis vectors of this
coordinate system if they commute in this neighborhood. Let us rewrite the map as
the coordinate transformation from $\{\alpha, \beta\}$ to $\{x^1, x^2\}$:

\[ x^i(\alpha, \beta) = e^{\beta\, d/d\mu}\, e^{\alpha\, d/d\lambda}\, x^i \,\big|_P. \]

The basis vectors $\partial/\partial\alpha$ and $\partial/\partial\beta$ have
components (in the $\{x^i\}$ coordinate system) $\partial x^i/\partial\alpha$ and
$\partial x^i/\partial\beta$, respectively. It is easy to show from (2.6) that

\[ \frac{\partial}{\partial\alpha}\, e^{\alpha\, d/d\lambda}
   = e^{\alpha\, d/d\lambda}\, \frac{d}{d\lambda}, \]

and since $d/d\mu$ and $d/d\lambda$ commute, we obtain

\[ \frac{\partial x^i}{\partial\alpha}
   = e^{\beta\, d/d\mu}\, e^{\alpha\, d/d\lambda}\, \frac{dx^i}{d\lambda} \,\Big|_P,
   \qquad
   \frac{\partial x^i}{\partial\beta}
   = e^{\beta\, d/d\mu}\, e^{\alpha\, d/d\lambda}\, \frac{dx^i}{d\mu} \,\Big|_P. \]

But $dx^i/d\lambda$ is just the component of $d/d\lambda$ in the $\{x^i\}$
coordinate system. Since this is an analytic function on M, operating on it with
$\exp(\beta\, d/d\mu)\exp(\alpha\, d/d\lambda)$ simply produces its value at the
point whose coordinates are $(\alpha, \beta)$. Therefore we have everywhere in U

\[ \partial/\partial\alpha = d/d\lambda \quad\text{and}\quad \partial/\partial\beta = d/d\mu, \]

and we have proved the sufficiency of $[\bar A, \bar B] = 0$ as a condition that
$\bar A$ and $\bar B$ be coordinate basis vectors.


Fig. 2.22. The map from $R^2$ to M described in the text. This provides
a coordinate system in some neighborhood of P.


We return now to the deferred proof that $\{\alpha, \beta\}$ do form a coordinate
system in U. We must prove that the map $\{\alpha, \beta\} \to \{x^i\}$ has an
inverse, and for this we use the inverse function theorem (see §1.2). This says
that if the matrix

\[ \begin{pmatrix}
   \partial x^1/\partial\alpha & \partial x^2/\partial\alpha \\
   \partial x^1/\partial\beta  & \partial x^2/\partial\beta
   \end{pmatrix} \]

has a nonzero determinant at some point $\{\alpha, \beta\}$, then the map has an
inverse in some neighborhood of this point. The determinant will vanish if and
only if the vectors $\partial x^i/\partial\alpha$, $\partial x^i/\partial\beta$
are linearly dependent, but from the above discussion it is clear that this will
never happen because $\bar A$ and $\bar B$ are linearly independent in U.
Therefore, everywhere in U the map is invertible and provides a coordinate
system.

It is interesting to ask where this argument breaks down if
$[\bar A, \bar B] \neq 0$. In this case the expression
$\partial x^i/\partial\beta$ is more complicated. It is still true that, at least
in some neighborhood of $\alpha = \beta = 0$, the map has an inverse. But because
$\partial x^i/\partial\beta$ is no longer just $dx^i/d\mu$ at the point in
question, the vectors $\bar A$ and $\bar B$ are not the basis vectors of the
constructed coordinates.

The whole argument extends to n dimensions: if n vector fields
$\{\bar Y_{(j)}, j = 1, \ldots, n\}$ on an n-dimensional manifold M are linearly
independent and commute with one another in some open region U of M, then they
are the coordinate basis vectors of the coordinate system $\{\alpha_j\}$, given in
terms of an arbitrary system $\{x^i\}$ by

\[ x^i(\alpha_1, \ldots, \alpha_n)
   = \exp\Big[\sum_j \alpha_j \bar Y_{(j)}\Big]\, x^i \,\Big|_P, \]

centred at an arbitrary point P in U.
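
The construction can be tried on a concrete commuting pair. The following is a
minimal sketch, assuming the dilation field
$\bar A = x\,\partial/\partial x + y\,\partial/\partial y$ and the rotation field
$\bar B = x\,\partial/\partial y - y\,\partial/\partial x$ on the plane with the
origin removed, whose flows are known in closed form; the fields, the base point
P = (1, 0) and the function names are illustrative assumptions. The sketch builds
the map $(\alpha, \beta) \to x^i$ by flowing from P and checks, by central
differences, that $\partial x^i/\partial\alpha$ and $\partial x^i/\partial\beta$
reproduce the components of $\bar A$ and $\bar B$ at the image point.

```python
import math

def A(p):
    """Dilation field x d/dx + y d/dy."""
    return (p[0], p[1])

def B(p):
    """Rotation field x d/dy - y d/dx."""
    return (-p[1], p[0])

def flow_A(p, alpha):
    """Exact flow of A: scale p by exp(alpha)."""
    s = math.exp(alpha)
    return (s * p[0], s * p[1])

def flow_B(p, beta):
    """Exact flow of B: rotate p through the angle beta."""
    c, s = math.cos(beta), math.sin(beta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

P = (1.0, 0.0)

def chart(alpha, beta):
    """Point reached from P by moving alpha along A, then beta along B."""
    return flow_B(flow_A(P, alpha), beta)

def d_chart(alpha, beta, h=1e-6):
    """Central-difference estimates of dx^i/dalpha and dx^i/dbeta."""
    da = tuple((u - v) / (2 * h) for u, v in
               zip(chart(alpha + h, beta), chart(alpha - h, beta)))
    db = tuple((u - v) / (2 * h) for u, v in
               zip(chart(alpha, beta + h), chart(alpha, beta - h)))
    return da, db

q = chart(0.3, 1.1)
da, db = d_chart(0.3, 1.1)
print(da, A(q))   # the two pairs should agree
print(db, B(q))   # likewise
```

For this pair the constructed coordinates are $\alpha = \ln r$ and
$\beta = \theta$ in terms of ordinary polar coordinates, which is consistent with
the two fields being $r\,\partial/\partial r$ and $\partial/\partial\theta$.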







2.16 One-forms 

Let us go back to $T_P$, the space of all tangent vectors at P. As a first
step towards tensors, we define a one-form as a linear, real-valued function of
vectors. This means the following: a one-form $\tilde\omega$ at P associates with
a vector $\bar V$ at P a real number, which we call $\tilde\omega(\bar V)$. This
notation expresses the idea that $\tilde\omega$ is a function on vectors. (A
tilde (~) over a letter always denotes a one-form, just as a bar (¯) denotes a
vector.) The linearity of this function means

\[ \tilde\omega(a\bar V + b\bar W) = a\,\tilde\omega(\bar V) + b\,\tilde\omega(\bar W), \tag{2.15} \]

where a and b are real numbers. We can define addition of one-forms and their
multiplication by real numbers in a straightforward way: $a\tilde\omega$ is the
one-form such that

\[ (a\tilde\omega)(\bar V) = a\,[\tilde\omega(\bar V)] \tag{2.16a} \]

for all $\bar V$, and $\tilde\omega + \tilde\sigma$ is the one-form such that

\[ (\tilde\omega + \tilde\sigma)(\bar V) = \tilde\omega(\bar V) + \tilde\sigma(\bar V) \tag{2.16b} \]

for all $\bar V$. Thus one-forms at the point P satisfy the axioms of a vector
space, which is called the dual vector space to $T_P$, and is denoted by
$T^*_P$. The reason it is ‘dual’ is that vectors can also be regarded as linear,
real-valued functions of one-forms, in the following manner. Given a vector
$\bar V$, its value on any one-form $\tilde\omega$ is defined as
$\tilde\omega(\bar V)$. This is linear, since its value on
$a\tilde\omega + b\tilde\sigma$ is, by (2.16) above,

\[
\begin{aligned}
(a\tilde\omega + b\tilde\sigma)(\bar V)
&= (a\tilde\omega)(\bar V) + (b\tilde\sigma)(\bar V) \\
&= a\,(\text{value of } \bar V \text{ on } \tilde\omega)
 + b\,(\text{value of } \bar V \text{ on } \tilde\sigma).
\end{aligned}
\tag{2.17}
\]


It is thus the linearity property which enables us to regard each as a function
taking the other as argument and producing a real number; vectors and one-forms
are thus said to be dual to each other. Their value on one another is represented
in several ways:

\[ \tilde\omega(\bar V) \equiv \bar V(\tilde\omega) \equiv \langle \tilde\omega, \bar V \rangle, \tag{2.18} \]

where the last expression emphasizes their equal status. The formation of the
number $\tilde\omega(\bar V)$ is often called the contraction of $\tilde\omega$
with $\bar V$. In older treatments of tensor algebra, vectors are often called
‘contravariant vectors’ and one-forms ‘covariant vectors’. These names refer to
the behavior of their components under a change of basis, which is something we
will deal with in §2.26.


2.17 Examples of one-forms 

Before going further with the mathematical development, let us look at 
some familiar examples of one-forms. One of the most common is the gradient 
of a function, which will be discussed in §2.19. Other examples include the 
following: 

(i) In matrix algebra, if we call column vectors ‘vectors’, then row vectors are
one-forms. This is because when multiplied (in the correct order) by the usual
rules of matrix multiplication, they give a single real number. For example, in
the two-dimensional case the row vector (−1, 5) may be thought of as a function
which takes an arbitrary column vector into a real number:

\[ (-1, 5) \begin{pmatrix} x \\ y \end{pmatrix} = -x + 5y. \]

That this function is linear is easily checked.

(ii) In the Hilbert spaces used in quantum mechanics, the analogues of
example (i) are Dirac kets $|\psi\rangle$ (vectors) and bras $\langle\phi|$
(one-forms), whose contraction is $\langle\phi|\psi\rangle$, a complex number.
(The generalization of vector and tensor algebra to algebras over the complex
numbers rather than the reals is trivial: one just replaces the word ‘real’ by
‘complex’. In many ways, the generalization of our real manifolds to
complex-analytic ones, where the maps are analytic maps to the space
$(z^1, z^2, \ldots, z^n)$ for complex rather than real $z^i$, is also simple.
But some features of complex manifolds, such as their global structure and
curvature, present special problems, which we cannot treat in this book.) The
notation $\langle\phi|\psi\rangle$ is, not accidentally, similar to (2.18).

In both examples (i) and (ii) one is used to switching between the vectors 
and one-forms, associating with a given vector its ‘conjugate’ or ‘transpose’, 
which is a one-form. We shall see in §2.29 below that this is equivalent to giving 
a metric or inner product to the vector space. This is a very important additional 
structure in a vector space, but the reader should bear in mind that there is no 
a priori, ‘natural’ way of associating a particular one-form with a particular 
vector. 
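
Example (i) above can be written out in a few lines. The following is a minimal
sketch, assuming column vectors are stored as Python tuples and a one-form is
modelled as a function built from a row vector; the helper names and the sample
vectors are illustrative assumptions. It checks the linearity property (2.15) for
the row vector (−1, 5).

```python
def one_form(row):
    """Turn a row vector into a function acting on column vectors."""
    def act(column):
        return sum(r * c for r, c in zip(row, column))
    return act

def combine(a, V, b, W):
    """The linear combination aV + bW of two column vectors."""
    return tuple(a * v + b * w for v, w in zip(V, W))

omega = one_form((-1, 5))            # the row vector of example (i)

V, W = (2, 3), (-4, 1)
lhs = omega(combine(2, V, -3, W))    # omega(2V - 3W)
rhs = 2 * omega(V) - 3 * omega(W)    # 2 omega(V) - 3 omega(W)
print(lhs, rhs)
```

Nothing in the sketch associates a particular row with a particular column; the
row (−1, 5) is simply given, in line with the remark that no ‘natural’
association exists without extra structure.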


2.18 The Dirac delta function

In quantum mechanics one often deals with function spaces. Consider
the set C[−1, 1] of all $C^\infty$ real-valued functions defined on the interval
$-1 \le x \le 1$ of $R^1$. This set is a group under addition (the sum of any two
$C^\infty$ functions is $C^\infty$, etc.) and a vector space under multiplication
by real constants (if f is a $C^\infty$ function, so is cf for any constant c).
Its dual space of one-forms is called the distributions. An example of a
distribution is the Dirac delta ‘function’ $\delta(x)$, which is defined as that
one-form whose value on a $C^\infty$ function f(x) is f(0):

\[ \langle \delta(x), f(x) \rangle = f(0). \tag{2.19} \]

In one sense $\delta(x)$ is a true function: it is a mapping
$C[-1, 1] \to R$. It is customary to apply the word ‘distribution’ only to a
continuous function of this sort. But any notion of continuity requires a
topology, in this case a topology for C[−1, 1]. This is an infinite-dimensional
vector space (there are an infinite number of linearly independent $C^\infty$
functions), and a discussion of its topology is well outside the scope of this
book. The interested reader can consult Choquet-Bruhat et al. (1977). What is
important for one to understand at present is that this sense of function is not
what Dirac and his contemporaries meant when they called $\delta(x)$ the delta
function. To see what they had in mind we have to look again at a way of
transforming a function in C[−1, 1] into a one-form on C[−1, 1].
For any function g in C[−1, 1] it is possible to define a one-form $\tilde g$
whose value on a function f in C[−1, 1] is

\[ \tilde g(f) = \int_{-1}^{1} g(x) f(x)\, dx. \tag{2.20} \]

This is indeed a linear function mapping f to the integral’s value. (Since g and f
are continuous on $-1 \le x \le 1$, they are bounded there and the integral always
exists.) The name ‘delta function’ was used as a loose way of turning this
relation around: if $\tilde\delta$ is a one-form, then one ought to be able to
talk about it as a function of x in the ordinary sense whose integral with f(x)
produced f(0):

\[ \int_{-1}^{1} \delta(x) f(x)\, dx = f(0). \]

This idea caused great distress to mathematicians, some of whom even declared
that Dirac was wrong despite the fact that he kept getting consistent and useful
results. Wisely, the physicists rejected these extreme criticisms and followed
their intuition. We can now see why they were ‘wrong’ and still succeeded. They
were ‘wrong’ because they spoke of $\delta(x)$ as a function $R^1 \to R^1$, which
it cannot be in any precise sense, and because they treated it as a function by
integrating it and even differentiating it:

\[ \int_{-1}^{1} \delta'(x) f(x)\, dx = -\int_{-1}^{1} \delta(x) f'(x)\, dx = -f'(0). \]

But they were ‘right’ because they never used $\delta(x)$ outside integrals with
sufficiently-differentiable functions f(x): they never used it except to map
functions to real numbers. In this sense they employed the machinery but not
the words of distribution theory, which was devised expressly in order to give
delta functions a sound basis. Notice, however, that distribution theory has one
big simplification over the older physicists’ view: it can define the delta
function without referring to any rule like (2.20) for turning a function into a
one-form. As remarked in example (ii) above, such a rule is an extra structure on
a vector space, which we now see is unnecessary for understanding delta functions.

We should remark in passing that we restricted the word ‘distribution’ to
continuous one-forms, in keeping with the usual practice in defining the dual of
any vector space. However, we did not include the word ‘continuous’ in our
definition of one-forms in §2.16: have we been inconsistent? The answer is no,
because on a finite-dimensional vector space a linear function is always
continuous. (See Choquet-Bruhat et al. (1977) or Rudin (1964), in the
bibliography of chapter 1.)
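
The two points of view can be put side by side in code. The following is a
minimal sketch, assuming functions on [−1, 1] are ordinary Python callables, the
integral in (2.20) is approximated by a midpoint rule, and a narrow Gaussian of
width 0.01 stands in for the physicists’ ‘spike’; the names, the width, the
number of sample points and the test function are all illustrative assumptions.
The delta distribution itself is defined directly as the map $f \to f(0)$, with
no integral involved.

```python
import math

def delta(f):
    """The delta distribution as a one-form on functions: f -> f(0), as in (2.19)."""
    return f(0.0)

def delta_prime(f_prime):
    """The derivative of delta, acting through the derivative of f: -f'(0)."""
    return -delta(f_prime)

def one_form_from(g, n=20000):
    """Rule (2.20): the one-form f -> integral of g(x) f(x) over [-1, 1].

    The integral is approximated by the midpoint rule with n cells.
    """
    h = 2.0 / n
    def act(f):
        return h * sum(g(-1 + (k + 0.5) * h) * f(-1 + (k + 0.5) * h)
                       for k in range(n))
    return act

def narrow_gaussian(width):
    """A smooth bump of (very nearly) unit area, used as an approximate spike."""
    norm = width * math.sqrt(2 * math.pi)
    return lambda x: math.exp(-(x / width) ** 2 / 2) / norm

def f(x):
    return math.cos(x) + 0.5 * x

def f_prime(x):
    return -math.sin(x) + 0.5

approx = one_form_from(narrow_gaussian(0.01))(f)
print(delta(f), approx)          # the second value is close to the first
print(delta_prime(f_prime))      # -f'(0)
```

The one-form built from the Gaussian only approximates $f(0)$, and no ordinary
function g reproduces it exactly, whereas `delta` needs no rule like (2.20) at
all.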


2.19 The gradient and the pictorial representation of a one-form 
A field of one-forms, by analogy with vector fields, is a rule giving a
one-form at every point. The rules in equation (2.16) extend to fields; in this
case, a is a function on M, not necessarily constant. Differentiability of
one-form fields can be defined in terms of that of vector fields and functions.
For example, on a $C^\infty$ manifold, a given one-form field $\tilde\omega$
defines, when supplied with a vector field $\bar V$, a function
$\tilde\omega(\bar V)$. If this function is $C^\infty$ for any $C^\infty$
$\bar V$ then $\tilde\omega$ is $C^\infty$. (We will give an easier definition of
differentiability after defining components of one-forms in §2.20.) As with
vector fields, there is a fiber bundle called the cotangent bundle $T^*M$ with M
as base and $T^*_P$ as the fiber over the point P. Cross-sections of $T^*M$ are
one-form fields.

A most useful and instructive one-form field is the gradient of a function f,
which we denote by $\tilde d f$. Although elementary treatments of vector
calculus call the gradient a vector, it is properly a one-form. Thus, the
gradient $\tilde d f$ (not the ‘infinitesimal’ df, which we rarely use)† is
defined by

\[ \tilde d f(d/d\lambda) = df/d\lambda, \tag{2.21} \]

where $d/d\lambda$ is an arbitrary tangent vector. That is, the gradient of f at
any point P is that element of $T^*_P$ whose value on an element $\bar V$ of
$T_P$ is the directional derivative of f along a curve whose tangent is
$\bar V$. We must check that this is a linear function on $T_P$, in the sense of
equation (2.15):

\[
\begin{aligned}
\tilde d f\left(a \frac{d}{d\lambda} + b \frac{d}{d\mu}\right)
&= \left(a \frac{d}{d\lambda} + b \frac{d}{d\mu}\right) f \\
&= a \frac{df}{d\lambda} + b \frac{df}{d\mu} \\
&= a\, \tilde d f(d/d\lambda) + b\, \tilde d f(d/d\mu).
\end{aligned}
\]

So it is indeed linear. At first thought it might seem that f itself should be the
one-form, since f and $d/d\lambda$ make $df/d\lambda$, a number. But this is not
right; the reader is reminded that both $T_P$ and $T^*_P$ are defined at a point
P, so all the information needed to construct $df/d\lambda$ must be present
there. The value of f at P is irrelevant to $df/d\lambda$. To compute
$df/d\lambda$ at P one needs to know $\partial f/\partial x^i$ at P. These are,
as we shall see, the components of the gradient of f. So it is the gradient which
is the one-form.
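
Definition (2.21) can be checked on an example. The following is a minimal
sketch, assuming a particular function f on $R^2$, a particular curve
$x^i(\lambda)$, and the component expression
$\sum_i (\partial f/\partial x^i)(dx^i/d\lambda)$ for the value of the gradient on
the tangent vector; the function, the curve and all names are illustrative
assumptions, and the partial derivatives are worked out by hand. The value of the
gradient on $d/d\lambda$ should agree with the derivative of f along the curve.

```python
import math

def f(x, y):
    return x**2 * y + math.sin(y)

def grad_components(x, y):
    """Partial derivatives (df/dx, df/dy) of the f above, worked out by hand."""
    return (2 * x * y, x**2 + math.cos(y))

def curve(lam):
    """An example curve x^i(lam); its tangent vector is d/dlam."""
    return (math.cos(lam), lam**2)

def tangent(lam):
    """Components dx^i/dlam of the tangent vector, differentiated by hand."""
    return (-math.sin(lam), 2 * lam)

def df_on_tangent(lam):
    """Value of the gradient on d/dlam: sum over i of (df/dx^i)(dx^i/dlam)."""
    g = grad_components(*curve(lam))
    return sum(gi * vi for gi, vi in zip(g, tangent(lam)))

def df_dlam(lam, h=1e-6):
    """Derivative of f along the curve, estimated by a central difference."""
    return (f(*curve(lam + h)) - f(*curve(lam - h))) / (2 * h)

lam = 0.7
print(df_on_tangent(lam), df_dlam(lam))
```

Only the partial derivatives of f at the point enter `df_on_tangent`; adding a
constant to f changes neither number, in line with the remark that the value of
f at P is irrelevant.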

† For a nice discussion of the relation of $\tilde d f$ to the infinitesimal, see
Spivak (1970), vol. 1.

The gradient enables us to develop a picture of a one-form, complementary
to the picture of a vector as an arrow. In figure 2.23 we have drawn part of a
topographical map, showing contours of equal elevation. If h is the elevation,
then the gradient $\tilde d h$ is clearly largest in an area like A, where the
lines are closest together, and smallest near B, where the lines are spaced far
apart. Moreover, suppose one wanted to know how much elevation a (short) walk
between two points would involve. One can lay out on the map a line (vector
$\Delta\bar x$) between the points. Then the number of contours the line crosses
gives the change in elevation. For example, line 1 crosses 1½ contours, while 2
crosses 2 contours. Line 3 starts near 2 but goes in a different direction,
winding up only ½ a contour higher. But these numbers are just $\Delta h$, which
is a linear function of $\tilde d h$ and $\Delta\bar x$:

\[ \Delta h = \sum_i \frac{\partial h}{\partial x^i}\, \Delta x^i. \]

This is the value of $\tilde d h$ on $\Delta\bar x$ (cf. equation (2.21) above and
(2.27) below). Therefore, a one-form $\tilde\omega$ may be represented by a series
of surfaces (figure 2.24), and its contraction with a vector $\bar V$ is the
number of surfaces $\bar V$ crosses. The closer the surfaces, the larger
$\tilde\omega$. Properly, just as a vector is straight, the one-form’s surfaces
are straight and parallel. This is because we deal with one-forms at a point, not
over an extended region: ‘tangent’ one-forms, in the same sense as tangent
vectors.

These pictures show why one in general cannot call a gradient a vector. One
would like to identify the vector gradient as that vector pointing ‘up’ the slope,
i.e. in such a way that it crosses the greatest number of contours per unit length.
The key phrase is ‘per unit length’. If there is a measure of distance on the
manifold, then a vector can be associated with a gradient. But if one does not
know how to compare the lengths of vectors that point in different directions, one
cannot define a direction of steepest ascent, and the gradient is fundamentally
different from a vector. Since we shall not assume a length (or ‘metric’) in
general, we must preserve the distinction between vectors and one-forms. We
will return to this point in §2.29.


Fig. 2.23. A topographical map of a hilly region. Curves are contours
of equal elevation above sea level. Arrows indicate possible paths for
a walker.


Fig. 2.24. A ‘tangent’ one-form $\tilde\omega$ represented pictorially as a
series of parallel surfaces of dimension one less than that of the manifold. The
number pierced by a vector $\bar V$ is the contraction
$\langle\tilde\omega, \bar V\rangle$.


2.20 Basis one-forms and components of one-forms 

In the vector space of one-forms at P, $T^*_P$, any n linearly independent
one-forms constitute a basis. However, once a basis
$\{\bar e_i, i = 1, \ldots, n\}$ has been chosen for the vectors $T_P$ at P, this
induces a preferred basis for $T^*_P$, called the dual basis
$\{\tilde\omega^i, i = 1, \ldots, n\}$. It is defined as follows. If $\bar V$ is
any vector in $T_P$ then $\tilde\omega^i$ produces the ith component of
$\bar V$:

\[ \tilde\omega^i(\bar V) = V^i. \tag{2.22} \]

It is easy to see that this is linear in the argument $\bar V$, since the ith
component of, say, $\bar V + \bar W$ is $V^i + W^i$. So (2.22) indeed defines a
linear function on $T_P$. In particular, since the basis vector $\bar e_j$ has
only a jth component, all others vanishing, we have

\[ \tilde\omega^i(\bar e_j) = \delta^i{}_j. \tag{2.23} \]

This is the definition of $\tilde\omega^i$ found in most references. Note
carefully that in order to define any $\tilde\omega^i$ all the vectors
$\{\bar e_j\}$ must be known. A change in any one $\bar e_j$ generally changes all
the basis one-forms $\tilde\omega^i$. The correspondence we have established is
between one basis and its dual, not between an individual vector and an
associated one-form.

We have not actually proved that the $\{\tilde\omega^i\}$ are linearly
independent and therefore do form a basis. This follows easily from (2.23), but
we will use a more indirect approach. Consider any one-form $\tilde q$ acting on
an arbitrary vector $\bar V$,

\[
\begin{aligned}
\tilde q(\bar V) &= \tilde q\Big(\sum_j V^j \bar e_j\Big) \\
&= \sum_j V^j\, \tilde q(\bar e_j) \\
&= \sum_j \tilde\omega^j(\bar V)\, \tilde q(\bar e_j).
\end{aligned}
\tag{2.24}
\]

The numbers

\[ q_j = \tilde q(\bar e_j) \tag{2.25} \]

are called the components of $\tilde q$ on the basis dual to $\{\bar e_i\}$. To
see that this name is more than a mere analogy with (2.22), we rewrite (2.24) as

\[ \tilde q(\bar V) = \sum_j q_j\, \tilde\omega^j(\bar V). \]

Since a one-form is defined by its values on vectors, it follows from this,
equation (2.16), and the fact that $\bar V$ is arbitrary, that

\[ \tilde q = \sum_j q_j\, \tilde\omega^j. \tag{2.26} \]

This shows that the set $\{\tilde\omega^j\}$ is indeed a basis, since there are
only n of them and any $\tilde q$ is a linear combination of them. It also shows
that the numbers $\{q_j\}$ are indeed the components of $\tilde q$ on this basis
in the ordinary sense.

Most importantly, we now have a formula giving us the value of
$\tilde q(\bar V)$ if we know the components of $\tilde q$ and of $\bar V$:

\[ \tilde q(\bar V) = \sum_j q_j V^j. \tag{2.27} \]

As remarked before, this is the contraction of $\bar V$ and $\tilde q$.
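
The construction of the dual basis is concrete in two dimensions. The following
is a minimal sketch, assuming vectors and one-forms on $R^2$ are stored by their
Cartesian components, so that the dual basis one-forms are the rows of the
inverse of the matrix whose columns are the basis vectors; the helper names and
the sample basis are illustrative assumptions. It checks (2.23) and (2.27), and
shows that changing a single basis vector changes both dual one-forms.

```python
def dual_basis(e1, e2):
    """Dual basis one-forms for a basis {e1, e2} of R^2.

    They are the rows of the inverse of the matrix with columns e1 and e2.
    """
    det = e1[0] * e2[1] - e2[0] * e1[1]
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent")
    w1 = (e2[1] / det, -e2[0] / det)
    w2 = (-e1[1] / det, e1[0] / det)
    return w1, w2

def contract(form, vec):
    """Value of a one-form on a vector, both given by Cartesian components."""
    return sum(a * b for a, b in zip(form, vec))

e1, e2 = (1.0, 0.0), (1.0, 2.0)
w1, w2 = dual_basis(e1, e2)

V = (3.0, 4.0)
q = (-1.0, 5.0)

V_comp = (contract(w1, V), contract(w2, V))     # components V^j, from (2.22)
q_comp = (contract(q, e1), contract(q, e2))     # components q_j, from (2.25)

direct = contract(q, V)                                   # q(V) computed directly
by_components = sum(qj * Vj for qj, Vj in zip(q_comp, V_comp))   # (2.27)
print(direct, by_components)

# Replace e1 only: both dual one-forms change.
w1_new, w2_new = dual_basis((1.0, 1.0), e2)
print((w1, w2), (w1_new, w2_new))
```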

Naturally, all these considerations extend directly to one-form fields. If the
set of vector fields $\{\bar e_i\}$ is a basis at every point in some region U of
M, then the fields $\{\tilde\omega^i\}$ defined by (2.23) are likewise a basis at
all points of U. A coordinate system on U, $\{x^i\}$, defines a natural basis for
vector fields $\{\partial/\partial x^i\}$. It also defines a natural set of n
one-forms, the gradients $\{\tilde d x^i\}$. These one-forms are in fact the
basis dual to the coordinate basis vectors: by equation (2.21)

\[ \tilde d x^i(\partial/\partial x^j) = \partial x^i/\partial x^j = \delta^i{}_j, \tag{2.28} \]

the second equality following from the ordinary properties of partial derivatives.

In §2.19 we defined differentiability of one-form fields. It is now easy to
prove that $\tilde q$ is a $C^\infty$ one-form field if and only if its components
$\{q_i\}$ associated with a $C^\infty$ basis for vector fields are $C^\infty$
functions.


2.21 Index notation 

We adopt the following conventions for the use of indices. Components of
vectors, e.g. $V^i$, have the index written as a superscript; components of
one-forms, e.g. $\omega_i$, have subscripted indices. Members of a vector basis
are labelled with subscripts ($\bar e_i$), those of a one-form basis with
superscripts ($\tilde\omega^j$). (For coordinate bases, this rule means that the
one-forms $\tilde d x^i$ have their index up, as they should, while the vectors
$\partial/\partial x^i$ are considered to have their index down, since it appears
in the denominator as a superscript.) These conventions are adopted for a good
reason. Consider the contraction

\[ \tilde\omega(\bar V) = \sum_j V^j \omega_j, \]

which is a sum of products in which one multiplier has a raised index and the
other a lowered one. We shall adopt the Einstein summation convention:
whenever an expression contains a repeated index, once as subscript and once
as superscript, a summation over the index is understood. Thus, in the
expressions

\[ \tilde\omega = \omega_i\, \tilde\omega^i, \qquad \bar V = V^i \bar e_i, \qquad
   \tilde\omega(\bar V) = V^i \omega_i, \]

summations are understood. In the expressions

\[ V^i \omega_j, \qquad V^j \omega_k, \qquad V^i W^i, \]

there are no summations; in the first two there are no repeated indices, and in
the last both are raised. Use of the summation convention greatly simplifies
calculations in which components are used, and our rules for the placement of
indices minimize the possibility that the convention will lead us into careless
errors.

We are now in a position to extend our treatment of vector algebra to tensors. 


2.22 Tensors and tensor fields 

Tensors are a natural extension of the concepts we have already 
developed. Their algebra is straightforward and they have, as we shall see, many 
uses. The principal problem students have when they first encounter tensors is 
that they cannot ‘visualize’ them: they have no picture. We have earlier devel- 
oped pictorial ways of representing vectors and one-forms; this can to some 
extent be extended to tensors of higher type, but the pictures rapidly become 
very complicated. It is perhaps better to avoid picturing most tensors directly, 
and to think of them in terms of the definition we shall now give, as linear 
operators on vectors and one-forms. 

Consider a point P of M. A tensor of type (’) at P is defined to be a linear 
function which takes as arguments NV one-forms and NV’ vectors and whose value 
is a real number. This is a generalization of the way we defined one-forms. By 
‘linear’ we understand linearity on every argument (usually called multi- 
linearity). For example, if F is a (2) tensor then its value on the one-forms & 
and @ and the vectors V and W is 

F(,6;V, W). 
As a linear function it obeys (for arbitrary numbers a, b) 
9 Ε(αῶ -Εδλ. δ: Ρ, W) = αξ(ῶ, δ: Ρ, W)+ DFO, 3 J, W), (2.29) 
and similarly for the other arguments. If we want to speak of F without naming 
its arguments, we may sometimes write F( , ; , ), in which empty spaces 
signify ‘slots’ into which any arguments of the appropriate type (one-forms 
before the semicolon, vectors after) may be placed. Naturally, the order of the 
arguments generally makes a difference, as is true of functions of real variables. 
(That is, the function f(x, y) = 3x + Sy has different values for f(1, 2) and 
(2, 1).) 


As with vectors and one-forms, a (9) tensor field is a rule giving a (9) 
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tensor at each point. Linearity extends to tensor fields, where the numbers a 
and b in equation (2.29) can have different values at each point: they are func- 
tions on M. Differentiability of the field is defined as for one-forms, $2.19. 

As a special case, note that vectors are tensors of type $\binom{1}{0}$: they are linear
functions of one-forms. Similarly, one-forms are tensors of type $\binom{0}{1}$. By
convention, a scalar function on the manifold is taken to be a tensor of type $\binom{0}{0}$.
(See §2.28 on ‘Functions and scalars’ below.) A $\binom{1}{1}$ tensor T requires two
arguments. Thus, $T(\tilde\omega; V)$ is a real number; for fixed $\tilde\omega$, $T(\tilde\omega;\ )$ is a one-form,
since it needs a vector argument to give a real number; for fixed V, $T(\ ; V)$ is a
vector, since it needs a one-form argument to give a real number. So, a $\binom{1}{1}$
tensor in particular can be thought of as a linear vector-valued function of
vectors, and also as a linear form-valued function of one-forms. This game can
be played with any tensor.
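
The slot picture can be tried out numerically. The following is a minimal sketch, assuming a two-dimensional space and a $\binom{1}{1}$ tensor specified by a component array; the helper names (`make_tensor`, `one_form_from`, `vector_from`) are ours and purely illustrative. It checks linearity in the one-form slot, as in equation (2.29), and shows that filling one slot leaves a function of the other argument.

```python
# Illustrative only: a (1,1) tensor on a 2-dimensional space, defined by a
# component array with components[i][j] standing for T(omega^i; e_j).

def make_tensor(components):
    """Return the function T(omega; V) = sum_ij omega_i T^i_j V^j."""
    def T(omega, V):
        return sum(omega[i] * components[i][j] * V[j]
                   for i in range(len(omega)) for j in range(len(V)))
    return T

T = make_tensor([[1.0, 2.0], [3.0, 4.0]])

omega, lam = [1.0, -1.0], [0.5, 2.0]   # components of two one-forms
V = [2.0, 1.0]                         # components of a vector
a, b = 3.0, -2.0

# Linearity in the one-form slot: T(a*omega + b*lam; V) = a T(omega; V) + b T(lam; V).
combo = [a * omega[i] + b * lam[i] for i in range(2)]
lhs = T(combo, V)
rhs = a * T(omega, V) + b * T(lam, V)

def one_form_from(T, omega):
    """T(omega; _): a function that still needs a vector, i.e. a one-form."""
    return lambda W: T(omega, W)

def vector_from(T, V):
    """T(_; V): a function that still needs a one-form, i.e. a vector."""
    return lambda sigma: T(sigma, V)

print(lhs, rhs)
```

Both `one_form_from(T, omega)(V)` and `vector_from(T, V)(omega)` return the same number as `T(omega, V)`, which is the content of the remark that the same tensor can be read in several ways.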


2.23 Examples of tensors

Although our definition of a tensor may seem rather abstract, it is in
fact directly applicable to many common problems. We mention
three examples immediately, and later (in §2.29) devote some time to a
discussion of a very important tensor, the metric tensor.

(i) We take our first example from matrix algebra. If column vectors are
vectors and row vectors are one-forms, then a matrix is a $\binom{1}{1}$ tensor, since
multiplying it by a vector gives a vector, and letting it operate on both in the
usual way gives a number.


Exercise 2.4 

A linear (‘active’) transformation in matrix algebra (e.g. an orthogonal
rotation) transforms one matrix into another. Show that it is therefore
a $\binom{2}{2}$ tensor when operating on matrices.


(ii) The second example is from the function space C[−1, 1] mentioned in
§2.18. A linear differential operator (e.g. x² d/dx) converts functions (‘vectors’
in this space) into other functions (vectors). Being linear, it is therefore also a
$\binom{1}{1}$ tensor in the space.

(iii) The third example is the stress tensor. Readers familiar with continuum
mechanics will know the stress tensor. Given a stressed material, and given an
imaginary plane passing through the material, the stress tensor gives the stress
vector across that plane (the force per unit area exerted by the material on one
side of the plane upon that on the other side). Now, a plane is a surface, and a
surface is represented by a one-form. The stress tensor turns out to be a linear,
vector-valued function of one-forms, or a $\binom{2}{0}$ tensor.


2.24 Components of tensors and the outer product
A simple $\binom{2}{0}$ tensor is the following: given two vectors V and W, we
form a tensor called $V \otimes W$ whose value on two one-forms $\tilde p$ and $\tilde q$ is the product
$V(\tilde p)\,W(\tilde q)$:
$$ V \otimes W(\tilde p, \tilde q) = V(\tilde p)\,W(\tilde q). \qquad (2.30) $$
The operation $\otimes$ is called the ‘outer product’, ‘direct product’, or ‘tensor
product’. Its generalization to arbitrary numbers and types of tensors is obvious.
The outer product of a $\binom{N}{N'}$ tensor with a $\binom{M}{M'}$ tensor is a tensor of type $\binom{N+M}{N'+M'}$.
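
As a concrete check of equation (2.30), here is a small sketch in two dimensions. The function names `outer`, `evaluate`, and `det2` are hypothetical helpers introduced only for this illustration. The last lines use the standard fact that a 2 × 2 component array of the form $V^iW^j$ has zero determinant, so a tensor whose component array has nonzero determinant cannot be a single outer product.

```python
def outer(V, W):
    """Components (V x W)^{ij} = V^i W^j of the outer product of two vectors."""
    return [[vi * wj for wj in W] for vi in V]

def evaluate(S, p, q):
    """Value of a (2,0) tensor with components S^{ij} on one-forms p and q."""
    return sum(S[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))

def det2(S):
    """Determinant of a 2x2 component array."""
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]

V, W = [1.0, 2.0], [3.0, -1.0]      # vector components
p, q = [2.0, 0.5], [1.0, 4.0]       # one-form components

S = outer(V, W)
V_of_p = sum(V[i] * p[i] for i in range(2))   # V(p) = V^i p_i
W_of_q = sum(W[j] * q[j] for j in range(2))   # W(q) = W^j q_j

# Sum of two simple tensors, e_1 x e_1 + e_2 x e_2: its component array is the
# unit matrix, which is not of the form V^i W^j.
sum_of_simple = [[1.0, 0.0], [0.0, 1.0]]

print(evaluate(S, p, q), V_of_p * W_of_q, det2(S), det2(sum_of_simple))
```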

The components of a tensor are its values when it takes basis vectors and
one-forms as arguments. If S is a $\binom{3}{2}$ tensor, then it has components on a basis $\{e_i\}$

$$ S^{ijk}{}_{lm} = S(\tilde\omega^i, \tilde\omega^j, \tilde\omega^k; e_l, e_m). \qquad (2.31) $$
If the order of the arguments of S matters, then so does the order of the indices
of $S^{ijk}{}_{lm}$.


The extension to components of tensor fields and their differentiability is
exactly as for one-forms in §2.20.


Exercise 2.5 

(a) Prove that a general $\binom{2}{0}$ tensor cannot be expressed as a simple outer
product of two vectors. (Hint: count the number of components a
$\binom{2}{0}$ tensor may have.)

(b) Prove that the $\binom{1}{1}$ tensor $V \otimes \tilde\omega$ has components $V^i\omega_j$.


Exercise 2.6 

Prove that the set of all $\binom{2}{0}$ tensors at P is a vector space under addition
defined by analogy with equation (2.160). Show that $\{e_i \otimes e_j\}$ is a basis
for that space. (Thus, although a general $\binom{2}{0}$ tensor is not a simple outer
product, it can be represented as a sum of such tensors.) This vector
space is called $T_P \otimes T_P$.


2.25 Contraction 
In exercise 2.5, we point out that the set $\{V^i\omega_j\}$ are components of a
$\binom{1}{1}$ tensor. Now, by summing on the index, one gets $V^i\omega_i$, a number
independent of the basis, the value of $\tilde\omega$ on V, which may be thought of as a
$\binom{0}{0}$ tensor. By analogy one can show that if $S^i{}_{jk}$ and $P^{lm}$ are components,
respectively, of a $\binom{1}{2}$ and a $\binom{2}{0}$ tensor, then $S^i{}_{jk}P^{lm}$ is a component of a
$\binom{3}{2}$ tensor, $S^i{}_{jk}P^{jm}$ of a $\binom{2}{1}$ tensor, $S^i{}_{jk}P^{km}$ of a different
$\binom{2}{1}$ tensor, etc. By analogy with equation
(2.27), this operation is called contraction, and produces new tensors.

We can give a short proof of the fact that contraction is independent of the
basis used. Consider the $\binom{2}{0}$ tensor A, the $\binom{0}{2}$ tensor B, and their contraction
(in some basis) $A^{ij}B_{jk}$. We claim that these are the components of a $\binom{1}{1}$ tensor
C, such that for arbitrary vector V and one-form $\tilde\sigma$

$$ C(\tilde\sigma; V) = \Bigl(\sum_j A^{ij}B_{jk}\Bigr)\sigma_i V^k = \sum_j A(\tilde\sigma, \tilde\omega^j)\,B(e_j, V). $$
By linearity on A’s second argument we can write this as
$$ C(\tilde\sigma; V) = A\Bigl(\tilde\sigma, \sum_j B(e_j, V)\,\tilde\omega^j\Bigr), $$
since the quantities $B(e_j, V)$ are just numbers. But in §2.20 we proved in effect
that, independent of the basis,

$$ \sum_j B(e_j, V)\,\tilde\omega^j = B(\ , V), $$
which is a one-form since (for fixed V) it requires a vector as an argument (in
the empty slot). This one-form occupies one of the slots in A, so we have proved

$$ A^{ij}B_{jk} = C^i{}_k \iff C(\tilde\sigma; V) = A(\tilde\sigma, B(\ , V)), $$
independent of any basis (cf. exercise 2.8).
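
The two routes in this proof can be compared numerically. The sketch below assumes a two-dimensional space and arbitrarily chosen component arrays; all variable names are illustrative. Route 1 uses the contracted components $C^i{}_k = A^{ij}B_{jk}$ directly, while route 2 first builds the one-form $B(\ , V)$ and then places it in the second slot of A.

```python
n = 2
A = [[1.0, 2.0], [0.0, -1.0]]    # A^{ij}, components of a (2,0) tensor
B = [[3.0, 1.0], [1.0, 2.0]]     # B_{jk}, components of a (0,2) tensor

# Contraction on the repeated index j: C^i_k = A^{ij} B_{jk}.
C = [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
     for i in range(n)]

sigma = [2.0, -1.0]              # one-form components sigma_i
V = [1.0, 3.0]                   # vector components V^k

# Route 1: evaluate C(sigma; V) from the contracted components.
via_components = sum(C[i][k] * sigma[i] * V[k]
                     for i in range(n) for k in range(n))

# Route 2: form the one-form B( , V), whose components are B_{jk} V^k,
# then feed it into the second slot of A.
B_slot = [sum(B[j][k] * V[k] for k in range(n)) for j in range(n)]
via_slots = sum(A[i][j] * sigma[i] * B_slot[j]
                for i in range(n) for j in range(n))

print(via_components, via_slots)
```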


Exercise 2.7 

How many different (7) tensors may be made by contraction on pairs 
of indices of the (3) tensor ΟΥ 31113 How many (6) tensors by a second 
contraction? 


Exercise 2.8 

Let A and B be two $\binom{1}{1}$ tensors, and regard them as vector-valued linear
functions of vectors: if V is a vector then A(V) and B(V) are vectors.
Show that if we define C(V) to be

$$ C(V) = B(A(V)), $$
then C is a $\binom{1}{1}$ tensor as well. Show that its components are

$$ C^i{}_j = B^i{}_k A^k{}_j. $$
Discuss the relation of this with the linear transformation defined in
§1.6.


2.26 Basis transformations 

The behavior of a tensor’s components under a change of basis is at the 
heart of the older definition of a tensor. It has been replaced more recently by 
the definition we have used here in terms of linear functions, and it is a measure
of how conceptually different these two approaches are that we are only now
getting around to looking at basis transformations. This is not to say that these 
transformations are unimportant. Most practical calculations involving tensors 
involve working with their components, and an understanding of their trans- 
formation properties is essential. 

We shall consider vectors and tensors defined at some point P of M. Suppose
we begin with a vector basis $\{e_i,\ i = 1, \ldots, n\}$ and wish instead to use a basis
$\{e_{j'},\ j' = 1, \ldots, n\}$. (We shall use primes on the indices as our only way of
distinguishing references to one basis from references to the other.) Then in $T_P$
there is a linear transformation $\Lambda$ from the old basis to the new:

$$ e_{j'} = \Lambda^i{}_{j'}\, e_i. \qquad (2.32) $$
The matrix $\Lambda^i{}_{j'}$ is nonsingular (otherwise $\{e_{j'}\}$ would not be linearly
independent) but otherwise arbitrary. It is not the collection of components of some
tensor, since its indices refer to two different bases. It is simply called the
transformation matrix.
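The old one-form basis satisfies (2.23):

$$ \tilde\omega^i(e_j) = \delta^i{}_j. $$
Multiplying by $\Lambda^j{}_{j'}$ and using (2.32) and linearity gives

$$ \tilde\omega^i(e_{j'}) = \delta^i{}_j\,\Lambda^j{}_{j'} = \Lambda^i{}_{j'}. \qquad (2.33) $$
Now the matrix $\Lambda^i{}_{j'}$ has an inverse, which we will define to be $\Lambda^{k'}{}_i$:
$$ \Lambda^{k'}{}_i\,\Lambda^i{}_{j'} = \delta^{k'}{}_{j'}, \qquad \Lambda^{k'}{}_i\,\Lambda^j{}_{k'} = \delta^j{}_i. \qquad (2.34) $$
Multiplying (2.33) by $\Lambda^{k'}{}_i$ gives
$$ \Lambda^{k'}{}_i\,\tilde\omega^i(e_{j'}) = \delta^{k'}{}_{j'}. $$
By comparing this with (2.23) we find
$$ \tilde\omega^{k'} = \Lambda^{k'}{}_i\,\tilde\omega^i. \qquad (2.35) $$
This is the counterpart of (2.32): basis one-forms transform oppositely to basis
vectors (i.e. using the inverse transformation matrix) in order to satisfy (2.23) on
both bases.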
It is now a simple matter to transform components:
$$ V^{i'} = \tilde\omega^{i'}(V) = \Lambda^{i'}{}_j\,\tilde\omega^j(V) = \Lambda^{i'}{}_j V^j, \qquad (2.36) $$
$$ q_{i'} = \tilde q(e_{i'}) = \tilde q(\Lambda^j{}_{i'} e_j) = \Lambda^j{}_{i'}\,\tilde q(e_j) = \Lambda^j{}_{i'}\, q_j, \qquad (2.37) $$
and similarly for tensors of higher type (cf. exercise 2.9 below). These
transformation laws show that the components of vectors and the basis one-forms
obey the same law, which is opposite (i.e. uses the matrix inverse) to the law
obeyed by components of one-forms and the basis vectors. This is reasonable,
in order to keep such sums as $V^i e_i$, $V^i q_i$, etc. independent of basis. This
illustrates another convenience introduced by our positioning of indices and our
summation convention: the position of an index automatically gives its
transformation law. For example, $V^i$ and $\tilde\omega^i$ obey the same law, which is

$$ V^{i'} = \Lambda^{i'}{}_j V^j. $$
It could not use the matrix $\Lambda^i{}_{j'}$, because the summation must be on unprimed
indices and must involve one index which is up and one which is down.
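
A short numerical sketch may help fix the two opposite laws. It assumes a two-dimensional space and an arbitrarily chosen nonsingular matrix; the names `L`, `Linv`, and `inverse_2x2` are ours. Vector components are transformed with the inverse matrix as in (2.36), one-form components with the matrix itself as in (2.37), and the contraction $V^i q_i$ comes out the same in both bases even though the individual components change.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("transformation matrix must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]

# L[i][jp] plays the role of Lambda^i_{j'}: new basis vectors in terms of old.
L = [[2.0, 1.0], [1.0, 1.0]]
# Linv[kp][i] plays the role of the inverse matrix Lambda^{k'}_i.
Linv = inverse_2x2(L)

V = [3.0, -1.0]   # vector components V^j on the old basis
q = [0.5, 4.0]    # one-form components q_j on the old basis

# Equation (2.36): vector components use the inverse matrix.
V_new = [sum(Linv[ip][j] * V[j] for j in range(2)) for ip in range(2)]
# Equation (2.37): one-form components use the matrix itself.
q_new = [sum(L[j][ip] * q[j] for j in range(2)) for ip in range(2)]

scalar_old = sum(V[i] * q[i] for i in range(2))
scalar_new = sum(V_new[i] * q_new[i] for i in range(2))

print(V_new, q_new, scalar_old, scalar_new)
```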

These opposing transformation laws gave rise to the old names, ‘contra- 
variant’ and ‘covariant’. What we call a vector was called contravariant because 
its components obey the law opposite (‘contra’) to the law governing the basis 
vectors. Similarly, one-forms were ‘covariant vectors’ because their components
go with the basis vectors. The modern viewpoint emphasizes the fact that neither 
the vector nor the one-form is in fact changed by a basis transformation: they 
are coordinate-independent geometrical objects. Therefore, modern terminology 
has dropped the old names because they over-emphasize the coordinate- 
dependent descriptions of these objects. 


Exercise 2.9 
Show that a $\binom{2}{0}$ tensor’s components transform as two vectors, i.e.

$$ T^{i'j'} = \Lambda^{i'}{}_k\,\Lambda^{j'}{}_l\, T^{kl}. \qquad (2.38) $$
Generalize this to type $\binom{N}{N'}$.


Exercise 2.10 

Show that if a tensor’s components are all zero in one basis, they are 
zero in all bases. (We then say the tensor is zero. It follows that if two 
tensors have equal components in one basis they are equal in all, and 
the tensors are said to be equal.) 


Exercise 2.11 

Associated with a particular basis $\{e_i\}$ of a vector space of dimension
n, we are given some set of numbers $\{A^{ij};\ i, j = 1, \ldots, n\}$. We define
another set of numbers $A^{i'j'} = \Lambda^{i'}{}_k\,\Lambda^{j'}{}_l\, A^{kl}$ and call them the
components of the ‘tensor’ A on the new basis $\{e_{i'}\}$. Show that this ‘tensor’
is indeed a tensor as we have defined it. This shows that one can take
the point of view that a tensor is the collection $\{A^{ij}\}$ transforming in
the given way. This is an alternative definition to the one we have
used.


It is of particular interest to look at these basis transformations when they 
result from coordinate transformations, which were mentioned briefly at the 
end of §2.6. Suppose a region U of the manifold M has a coordinate system
$\{x^i,\ i = 1, \ldots, n\}$, and that we introduce new functions $\{y^{i'},\ i' = 1, \ldots, n\}$
given by the equations
$$ y^{i'} = f^{i'}(x^1, \ldots, x^n), \qquad i' = 1, \ldots, n, \qquad (2.39) $$
which can be summarized as $y^{i'} = f^{i'}(\{x^j\})$. These equations constitute a
coordinate transformation if the Jacobian matrix of partial derivatives $\partial y^{i'}/\partial x^j$ has
a nonvanishing determinant in U. A given point P in U can be described by two
different sets of numbers, $\{x^i\}$ or $\{y^{i'}\}$. At P we likewise have two different
coordinate vector bases, $\{\partial/\partial x^i\}$ and $\{\partial/\partial y^{i'}\}$, and, by the chain rule of calculus,

$$ \frac{\partial}{\partial y^{i'}} = \frac{\partial x^j}{\partial y^{i'}}\,\frac{\partial}{\partial x^j}. \qquad (2.40) $$
By comparing this with (2.32) we learn
$$ \Lambda^j{}_{i'} = \frac{\partial x^j}{\partial y^{i'}}. \qquad (2.41) $$
Similarly, the inverse matrix is
$$ \Lambda^{k'}{}_j = \frac{\partial y^{k'}}{\partial x^j}, \qquad (2.42) $$
which is easily proved using the chain rule for partial derivatives:
$$ \frac{\partial x^i}{\partial y^{k'}}\,\frac{\partial y^{k'}}{\partial x^j} = \frac{\partial x^i}{\partial x^j} = \delta^i{}_j. $$

It is important to understand that (2.42) defines only a restricted class of
transformation fields $\Lambda^{k'}{}_j$ in U. At any one point P in U one can choose all
n² elements of $\Lambda^{k'}{}_j$ arbitrarily (apart from the requirement that its determinant
should not vanish), but not so in the neighborhood of P, because (2.42) implies

$$ \frac{\partial \Lambda^{k'}{}_j}{\partial x^l} = \frac{\partial \Lambda^{k'}{}_l}{\partial x^j}, \qquad (2.43) $$
a symmetry that an arbitrary field $\Lambda^{k'}{}_j$ certainly would not need to satisfy. This
is another illustration that not every field of basis vectors is a coordinate basis.
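
For a concrete instance of (2.41) and (2.42), take the familiar change from Cartesian coordinates (x, y) to polar coordinates (r, θ) in the plane, with x = r cos θ and y = r sin θ. The sketch below is illustrative only (the function names are ours): it writes out the two Jacobian matrices at one point and confirms that their product is the unit matrix.

```python
import math

def jac_x_wrt_polar(r, theta):
    """J[j][ip] = d x^j / d y^{i'}, with x^j = (x, y) and y^{i'} = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def jac_polar_wrt_x(r, theta):
    """K[kp][j] = d y^{k'} / d x^j, evaluated at the same point."""
    return [[math.cos(theta),      math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]

def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

r, theta = 2.0, 0.7
# (d x^i / d y^{k'}) (d y^{k'} / d x^j) should be the Kronecker delta.
product = matmul(jac_x_wrt_polar(r, theta), jac_polar_wrt_x(r, theta))
print(product)
```

The second matrix is singular at r = 0, which reflects the fact that polar coordinates are not a valid coordinate system at the origin.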


2.27 Tensor operations on components 

Given a tensor T and its components $\{T^{i\ldots}{}_{j\ldots}\}$ on some basis, suppose
one multiplies each component by the number a, thereby obtaining $\{aT^{i\ldots}{}_{j\ldots}\}$.
These are clearly components of the tensor aT, and this means that the
operation of multiplying all the components of T by a is basis-invariant: had we
begun in coordinates $\{y^{i'}\}$ we would have obtained the components of aT in these
new coordinates. (One could not say the same had we multiplied only some of
$\{T^{i\ldots}{}_{j\ldots}\}$ by a.) Thus, the operation $\{T^{i\ldots}{}_{j\ldots}\} \to \{aT^{i\ldots}{}_{j\ldots}\}$ uniquely corresponds
to the basis-independent statement $T \to aT$. Similarly, the outer product of two
tensors,
$$ A, B \to A \otimes B, $$
has the unique component analogue (cf. exercise 2.4)

$$ \{A^{i\ldots}{}_{j\ldots},\ B^{k\ldots}{}_{l\ldots}\} \to \{A^{i\ldots}{}_{j\ldots}\,B^{k\ldots}{}_{l\ldots}\}, $$
independently of what coordinate or noncoordinate bases are used. In general,
an operation on components that produces components of the same tensor 
independently of the basis is called a tensor operation, and we will deal exclu- 
sively with them. The following list is a summary of the algebraic tensor oper- 
ations (we shall consider ones involving differentiation later): 


(i) Addition (and subtraction) of components of tensors of the same type. 
(ii) Multiplication of all components by a number gives a tensor of the 
same type. 
(iii) Multiplication of components of two tensors gives a tensor whose type 
is the sum of the two. 
(iv) Contraction on pairs of indices, one of which is up and the other down. 


An equation that involves components combined using only these operations is 
called a ‘tensor equation’. It follows from exercise 2.10 that if a series of oper- 
ations performed in a certain basis gives a tensor equation, then that equation is 
true in all bases. This often permits a convenient choice of basis for a particular 
calculation. 


2.28 Functions and scalars 

A scalar is defined as a $\binom{0}{0}$ tensor, i.e. a function on the manifold whose
definition does not depend upon the choice of any particular basis. For example,
the contraction $V^i\omega_i$ is a scalar, since its value is independent of the particular
basis in which the components are computed. On the other hand, the
component $V^1$ is also a function on the manifold, having a numerical value at every
point; it is not a scalar because its value depends on the basis. Put another way,
there is some (scalar) function f(P) such that $V^1(P) = f(P)$ when the index ‘1’
refers to some particular basis; when that basis is changed, the new $V^{1'}(P)$ will
not equal f(P). So f(P) is a scalar, whose value happens to equal that of the
one-component of V in some basis. But $V^1$ is not a scalar since its value changes with
a change in basis. You see, therefore, that whether a thing is a ‘scalar’ or simply
a ‘function’ depends on its interpretation when the basis is changed, rather than
on its actual value.


2.29 The metric tensor on a vector space 
Most familiar vector algebras involve an inner product between vectors,
as in §1.5. This is a rule which associates a number (the ‘dot product’) with two
vectors. It is a linear function of both vectors. Therefore it is a $\binom{0}{2}$ tensor, which
is called the metric tensor, $\mathbf{g}$. Thus we define
$$ \mathbf{g}(V, U) = \mathbf{g}(U, V) = U \cdot V. \qquad (2.44) $$
The first equality above is a demand that U · V should not depend on the order
of U and V. We say that $\mathbf{g}$ is a symmetric tensor. Its components on a basis
$\{e_i\}$ are

$$ g_{ij} = \mathbf{g}(e_i, e_j) = e_i \cdot e_j. \qquad (2.45) $$
These components form an n × n symmetric matrix. For reasons explained later,
we also demand that this matrix have an inverse. If it happens that the matrix is
the unit matrix, i.e. if

$$ g_{ij} = \delta_{ij}, $$
we say the metric tensor is the Euclidean metric, and the vector space is called
Euclidean space. But what can we say if $g_{ij}$ is not this simple? Well, we are
always free to try to choose a new basis $\{e_{j'}\}$ in which the new metric
components,

$$ g_{i'j'} = \Lambda^k{}_{i'}\,\Lambda^l{}_{j'}\, g_{kl}, \qquad (2.46) $$
are simpler. Consider this equation as a matrix equation. It is helpful to rewrite
it as

$$ g_{i'j'} = \Lambda^k{}_{i'}\, g_{kl}\, \Lambda^l{}_{j'}. $$
From the discussion in §1.6 it is easy to see that this is the matrix equation

$$ g' = \Lambda^T g \Lambda, \qquad (2.47) $$
where $\Lambda^T$ is the transpose of the matrix $\Lambda$ whose entries are $\Lambda^k{}_{j'}$. We will now
see that a clever choice of $\Lambda$ will reduce the matrix g′ to a very simple form.
Since $\Lambda$ is arbitrary, we will take it to be the product of two matrices

$$ \Lambda = OD, \qquad (2.48) $$
where O is an orthogonal matrix ($O^T = O^{-1}$) and D is a diagonal matrix (so in
particular $D^T = D$). Then we have, from equation (1.41),

$$ \Lambda^T = (OD)^T = D^T O^T = D O^{-1} $$
and

$$ g' = D O^{-1} g O D. \qquad (2.49) $$
It is well known that any symmetric matrix, such as g, can be reduced to
diagonal form, $g_d$, by a similarity transformation using an orthogonal matrix, so let
us choose O to do this:

$$ g_d = O^{-1} g O, \qquad g' = D g_d D. $$
If $g_d$ is the matrix $\mathrm{diag}(g_1, g_2, \ldots, g_n)$ and our as yet undetermined matrix D is
$\mathrm{diag}(d_1, d_2, \ldots, d_n)$, then g′ is
$$ g' = \mathrm{diag}(g_1 d_1^2,\ g_2 d_2^2,\ \ldots,\ g_n d_n^2). \qquad (2.50) $$
We now choose $d_j = (|g_j|)^{-1/2}$, so that each element on the diagonal of g′ is
either +1 or −1. We cannot use $d_j$ to change the sign of $g_j$, only its magnitude.
Now, the diagonal elements of $g_d$ are the eigenvalues of g, and are unique apart
from the order in which they appear. Moreover, since g has an inverse, none of
the eigenvalues is zero. If we choose O to make all the negative ones appear first,
then we have proved the theorem that any vector space with a metric tensor has
a basis on which the metric tensor has the canonical form diag(−1, ..., −1,
1, ..., 1). Such a basis is said to be orthonormal. The sum of these diagonal
elements (the trace of the canonical form) is called the signature of the
metric.
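
The construction $\Lambda = OD$ can be carried out explicitly for a 2 × 2 metric. The sketch below is one possible implementation under stated assumptions, not a general-purpose routine: it handles only symmetric, invertible 2 × 2 matrices, finds O from the standard rotation angle that diagonalizes such a matrix, orders the eigenvalues so that negative ones come first, and then rescales with $d_j = |g_j|^{-1/2}$. The function names are ours.

```python
import math

def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def canonical_form(g):
    """Return (Lam, g_prime) with g_prime = Lam^T g Lam = diag(+-1, +-1).

    Assumes g is a symmetric, invertible 2x2 matrix.
    """
    a, b, c = g[0][0], g[0][1], g[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)       # angle that removes the off-diagonal term
    O = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta),  math.cos(theta)]]
    g_d = matmul(transpose(O), matmul(g, O))       # diagonal, entries are the eigenvalues
    eig = [g_d[0][0], g_d[1][1]]
    if eig[0] > eig[1]:                            # put the smaller (negative) eigenvalue first
        O = [[row[1], row[0]] for row in O]        # swapping columns keeps O orthogonal
        eig.reverse()
    D = [[abs(eig[0]) ** -0.5, 0.0], [0.0, abs(eig[1]) ** -0.5]]
    Lam = matmul(O, D)                             # equation (2.48)
    g_prime = matmul(transpose(Lam), matmul(g, Lam))   # equation (2.47)
    return Lam, g_prime

Lam, g_prime = canonical_form([[0.0, 1.0], [1.0, 0.0]])
signature = round(g_prime[0][0] + g_prime[1][1])
print(g_prime, signature)
```

For the indefinite example used here the canonical form is diag(−1, 1), up to rounding, so the signature is zero; a positive-definite input would instead give the unit matrix.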


Exercise 2.12 
Find the matrices A which cast the following matrices into their unit 
diagonal form: 


2 1 ο 1 4 0 
ω [ ]. ) ί η, ©) : η. 


This theorem is very important. It means that there are only a few different 
kinds of metric tensors on a vector space. If the metric is positive-definite, then 
its canonical form must have all + 1s, and the space is Euclidean. If the metric 
is negative-definite it is also said to be Euclidean, since what is important for the 
space is whether the signs are all the same or not. If the metric is not of definite 
sign, it is called indefinite. An important case is the canonical form (−1,
1, ..., 1), whose metric is usually called a Minkowski metric; special relativity
has such a metric for n = 4, which we will discuss at length shortly.

Another consequence of this canonical form is that it picks out a preferred
set of bases for the vector space, the orthonormal bases. In Euclidean space $E^n$,
such a basis is called Cartesian. In it the metric tensor has the components
$g_{ij} = \delta_{ij}$, or in matrix form g = I. A transformation matrix $\Lambda_C$ from one such
basis to another satisfies

$$ I = \Lambda_C^T\, I\, \Lambda_C \;\Rightarrow\; \Lambda_C^T = \Lambda_C^{-1}. \qquad (2.51) $$
So the orthogonal matrices are the transformations between Cartesian bases.
These matrices form a group (the product of two orthogonal matrices is
orthogonal), which is called the Euclidean symmetry group O(n). A Minkowski metric
likewise singles out its preferred Lorentz bases in which the metric components
form the matrix

$$ \eta = \mathrm{diag}(-1, 1, \ldots, 1). \qquad (2.52) $$
A transformation matrix $\Lambda_L$ from one Lorentz basis to another satisfies

$$ \eta = \Lambda_L^T\, \eta\, \Lambda_L. \qquad (2.53) $$
Such a matrix is called a Lorentz transformation. It is not hard to show that
these too form a group, called the Lorentz group L(n) or O(n − 1, 1).
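
Condition (2.53) is easy to test for a specific case. The sketch below assumes n = 2 (one time and one space dimension), units in which c = 1, and the usual form of a velocity boost; the function names are illustrative. It checks that a boost preserves η and that the product of two boosts does too, which is the closure property needed for a group.

```python
import math

def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

eta = [[-1.0, 0.0], [0.0, 1.0]]     # equation (2.52) for n = 2

def boost(v):
    """Boost with speed v (|v| < 1, units with c = 1) acting on (x^0, x^1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v], [gamma * v, gamma]]

def preserves_eta(Lam, tol=1e-12):
    """Check equation (2.53): eta = Lam^T eta Lam, within a tolerance."""
    lhs = matmul(transpose(Lam), matmul(eta, Lam))
    return all(abs(lhs[i][j] - eta[i][j]) < tol
               for i in range(2) for j in range(2))

L1, L2 = boost(0.6), boost(-0.3)
print(preserves_eta(L1), preserves_eta(matmul(L1, L2)))
```

A rotation matrix, by contrast, satisfies (2.51) but fails this test for n = 2, since it mixes the directions of opposite sign.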

From the point of view of tensor algebra, the metric tensor’s most important
role is one we have not yet mentioned: it maps vectors into one-forms in a 1-1
manner. Consider a vector V. Then $\mathbf{g}(V,\ )$ is, for fixed V, a linear function of
vectors into real numbers: a one-form. We denote this by

$$ \tilde V = \mathbf{g}(V,\ ). \qquad (2.54) $$
The fact that we demanded that the matrix $g_{ij}$ have an inverse is what makes
this map 1-1: there is only one vector V mapped to $\tilde V$. To see how this works,
let us look at the component version of this equation. Denote the components
of $\tilde V$ by $V_i$:
$$ V_i = \mathbf{g}(V, e_i) = V^j g_{ji} = g_{ij}V^j, $$
where the last equality follows from the symmetry of $\mathbf{g}$. Now, the inverse
matrix to $g_{ij}$ will be called $g^{ij}$:
$$ g^{ij} g_{jk} = \delta^i{}_k. \qquad (2.55) $$
Then we have

$$ g^{ij}V_j = g^{ij}(g_{jk}V^k) = \delta^i{}_k V^k = V^i, \qquad (2.56) $$
which shows that the map is invertible: the metric provides a unique pairing
between one-forms and vectors. This pairing can be summarized:

$$ V_i = g_{ij}V^j, \qquad (2.57) $$
$$ V^i = g^{ij}V_j. \qquad (2.58) $$
Notice that we have denoted the elements of the inverse matrix by $g^{ij}$, and this
permits (2.58) to obey the usual index conventions for a tensor equation. But
for consistency one must show that the numbers $g^{ij}$ are in fact components of a
$\binom{2}{0}$ tensor. This is the object of the next exercise.


Exercise 2.13 

(a) Show that $\{g^{ij}\}$ are the components of a $\binom{2}{0}$ tensor $\mathbf{g}^{-1}$, either by
showing that they transform properly, or that they define a bilinear
function on one-forms.

(b) Show that if a vector basis $\{e_i\}$ is orthonormal, so is its dual one-form
basis $\{\tilde\omega^i\}$, in the sense that $\mathbf{g}^{-1}(\tilde\omega^i, \tilde\omega^j) = \pm\delta^{ij}$.


In the same way, the metric can map a $\binom{2}{0}$ tensor A into a $\binom{1}{1}$ tensor:

$$ A^i{}_j = g_{jk}A^{ik}. \qquad (2.59) $$
In turn this can be mapped into a $\binom{0}{2}$ tensor

$$ A_{ij} = g_{im}A^m{}_j = g_{im}\,g_{jn}A^{mn}, \qquad (2.60) $$
which can be mapped back into the original tensor

$$ A^{ik} = g^{il}g^{km}A_{lm}. \qquad (2.61) $$
These maps are called index raising and lowering, and it is conventional to give
all these tensors the same name (e.g. A), distinguishing them only by the
positions of their indices. It is sometimes unimportant in vector spaces with metrics
to say whether a tensor is of type $\binom{N}{N'}$ or of types $\binom{N+1}{N'-1}$, $\binom{N-1}{N'+1}$, etc. and then
one speaks only of the order of a tensor, which is N + N'.
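
Index lowering and raising, equations (2.57) and (2.58), amount to multiplication by the metric matrix and its inverse. The following sketch assumes a two-dimensional space and an arbitrarily chosen symmetric, invertible, non-diagonal $g_{ij}$, so that the components of a vector and of its one-form really differ; the helper names are ours.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the metric matrix must have an inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[2.0, 1.0], [1.0, 3.0]]        # g_{ij}: symmetric and invertible
g_inv = inverse_2x2(g)              # g^{ij}, the inverse matrix of (2.55)

def lower(V):
    """V_i = g_{ij} V^j, equation (2.57)."""
    return [sum(g[i][j] * V[j] for j in range(2)) for i in range(2)]

def raise_index(w):
    """V^i = g^{ij} V_j, equation (2.58)."""
    return [sum(g_inv[i][j] * w[j] for j in range(2)) for i in range(2)]

V = [1.0, -2.0]
V_low = lower(V)                    # components of the associated one-form
V_back = raise_index(V_low)         # raising again recovers the vector
norm = sum(V[i] * V_low[i] for i in range(2))   # g(V, V) = V^i V_i

print(V_low, V_back, norm)
```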

In a Euclidean vector space a Cartesian basis has $g_{ij} = \delta_{ij}$, so that $g^{ij} = \delta^{ij}$,
and $U^i = U_i$: there is no difference between the components of a vector and of
its associated one-form in this case. This is the reason that elementary
discussions of Euclidean vector algebra fail to distinguish between vectors and
one-forms, and also why they confine themselves to orthonormal bases. But
in a nonorthonormal basis for Euclidean space and in any basis in an indefinite
metric space, the components of a one-form can be very different from those of
its vector. We will see an interesting example of this in the section on special
relativity, §2.31 below.


2.30 The metric tensor field on a manifold 

A metric tensor field $\mathbf{g}$ on a manifold is a $\binom{0}{2}$ symmetric tensor field
which must have an inverse at every point. At every point P it serves as a metric
on the tangent space $T_P$, and all the properties discussed in the previous section
carry over directly. But there is much more.

The definition of a certain $\binom{0}{2}$ tensor on a manifold M as the metric of the
manifold endows M with a very rich structure. It immediately becomes ‘rigid’: 
one can define such notions as distance (see below) and curvature (see chapter 
6). These notions are so important in many applications, particularly in general 
relativity, that it is this sort of geometry that a physicist is most likely to be 
familiar with. But this is, from the point of view of differential geometry, a 
‘higher level’ structure: one goes beyond the notion of a simple differentiable 
manifold by picking out a certain tensor field as special. In doing this one may 
overlook the rich geometrical structure of the ordinary manifold itself. Such 
important tools as Lie derivatives and differential forms have nothing to do 
with metrics. Accordingly we shall put the metric tensor very much in the
background in this book, even in applications to manifolds on which one is defined.
In this section we take a brief look at its simplest properties. Further
development of metric geometry itself is deferred to chapter 6.

The metric tensor may be as differentiable as one requires, but it must at
least be continuous. This implies that its canonical form must be a constant
everywhere, since it is composed only of integers, and integers cannot change
continuously. So we speak of the signature of the field $\mathbf{g}$. As long as one can
choose the basis transformation matrix $\Lambda$ freely at each point, one can
transform from any given basis field to a globally orthonormal basis in which the
components of $\mathbf{g}$ are its canonical ones. But this transformation field $\Lambda$ is
usually not a coordinate transformation (i.e. it does not satisfy (2.43)), and in
fact it is generally impossible to find a coordinate basis which is also
orthonormal in any open region U of a manifold M (see exercise 2.14). The obvious
exception is $R^n$ considered as a manifold with the Euclidean metric $\delta_{ij}$ at
every point. But even here only the Cartesian coordinates generate an
orthonormal basis. An example of this is given in exercise 2.1 for polar coordinates
in $R^2$. The coordinate basis is orthogonal but not normalized. The rescaled
orthonormal basis is not a coordinate basis.


Exercise 2.14 

Show that a $C^\infty$ metric tensor field $\mathbf{g}$ is locally flat, in the sense that
any point P has a neighborhood in which there exists a coordinate
system on whose basis the components $g_{ij}$ have the following properties:

(i) $g_{ij}(P) = \pm\delta_{ij}$ (orthonormal form at P)

(ii) $\left.\dfrac{\partial g_{ij}}{\partial x^k}\right|_P = 0$ (orthonormal form a good approximation near P)

(iii) $\left.\dfrac{\partial^2 g_{ij}}{\partial x^k\,\partial x^l}\right|_P$ not necessarily all zero (no truly orthonormal
coordinate system)


Exercise 2.15

In polar coordinates in the Euclidean plane, find the components of
the metric on

(a) the basis $\{\partial/\partial r,\ \partial/\partial\theta\}$;

(b) the basis $\hat r$, $\hat\theta$ of exercise 2.1. Express $\hat r$, $\hat\theta$ in terms of $\partial/\partial r$ and $\partial/\partial\theta$.


Exercise 2.16 
Find the components of the one-form $\tilde{\mathrm{d}}f$ and the vector $\bar{\mathrm{d}}f$ on both bases of exercise
2.15.


Here a word of caution is in order. Most treatments of vector calculus in
curvilinear coordinates in Euclidean space use the components of a vector on 
this kind of orthonormal basis. This permits them to avoid distinguishing 
between vectors and one-forms. But when one compares the expressions we 
obtain below for, say, the divergence of a vector field in terms of its compo- 
nents with expressions given in other treatments, one must allow for possible 
differences of basis. 

An important property of the metric is that it permits a definition of length
on the manifold. If a curve has tangent V = dx/dλ, then a displacement dλ has
squared length

$$ dl^2 = dx \cdot dx = (V\,d\lambda)\cdot(V\,d\lambda) = V \cdot V\,(d\lambda)^2 = \mathbf{g}(V, V)\,d\lambda^2. \qquad (2.62) $$
(Here the symbol ‘d’ is the infinitesimal, not the gradient.) If a metric is
positive-definite, then $\mathbf{g}(V, V) > 0$ for all V ≠ 0. In such a case dl² is positive and we
have

$$ dl = [\mathbf{g}(V, V)]^{1/2}\,d\lambda \qquad (2.63) $$
as the length of an element of the curve. In an indefinite metric, however, the
squared length is not of definite sign. Curves are distinguished by having dl²
positive (‘space-like’) or negative (‘time-like’). Then one defines the real number

$$ dl = |\mathbf{g}(V, V)|^{1/2}\,d\lambda \qquad (2.64) $$
to be ‘proper distance’ for space-like curves and ‘proper time’ for time-like
curves. It is zero for ‘null’ curves. When one has an indefinite metric one must
be careful to distinguish a vector of zero norm from a vector which is truly zero
(all its components vanishing).
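
Equation (2.64) can be integrated numerically along a given curve. The sketch below is illustrative and rests on two assumptions stated in the comments: the Euclidean plane in polar coordinates has metric components diag(1, r²) on the coordinate basis, and a two-dimensional Minkowski metric has components diag(−1, 1). The function `curve_length` and its arguments are hypothetical names; it uses a simple midpoint rule.

```python
import math

def curve_length(point, tangent, metric, lam0, lam1, steps=1000):
    """Midpoint-rule integral of |g(V, V)|^(1/2) d(lambda), as in (2.64).

    point(lam)   -> coordinates of the curve at parameter lam
    tangent(lam) -> components of V = dx/d(lambda) on the coordinate basis
    metric(x)    -> matrix of metric components g_ij at the point x
    """
    h = (lam1 - lam0) / steps
    total = 0.0
    for k in range(steps):
        lam = lam0 + (k + 0.5) * h
        x, V = point(lam), tangent(lam)
        g = metric(x)
        gVV = sum(g[i][j] * V[i] * V[j]
                  for i in range(len(V)) for j in range(len(V)))
        total += math.sqrt(abs(gVV)) * h
    return total

# Assumption: Euclidean plane in polar coordinates (r, theta), g = diag(1, r^2).
def polar_metric(x):
    return [[1.0, 0.0], [0.0, x[0] ** 2]]

R = 2.0
circle = curve_length(lambda lam: (R, lam), lambda lam: (0.0, 1.0),
                      polar_metric, 0.0, 2.0 * math.pi)

# Assumption: two-dimensional Minkowski metric diag(-1, 1), units with c = 1.
def minkowski_metric(x):
    return [[-1.0, 0.0], [0.0, 1.0]]

# Straight time-like line x^1 = 0.6 x^0, parametrized by lam = x^0 from 0 to 1.
proper_time = curve_length(lambda lam: (lam, 0.6 * lam), lambda lam: (1.0, 0.6),
                           minkowski_metric, 0.0, 1.0)

print(circle, proper_time)
```

The circle of radius R comes out with circumference 2πR, and the time-like line has proper time 0.8 rather than the coordinate time 1, since g(V, V) = −0.64 along it.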


2.31 Special relativity 

The vector space $R^4$ equipped with a metric of signature +2 and
considered as a manifold is one of the most important manifolds in physics: it is
Minkowski spacetime, the spacetime of special relativity. Elementary treatments
of special relativity often do not introduce the metric tensor explicitly, but they
do provide us with all we need to see what the metric must be. In particular, we
know that there exists a preferred set of coordinate systems for spacetime, called
Lorentz frames, and that if two events are separated by coordinate intervals
(Δt, Δx, Δy, Δz) in such a frame, the number

$$ \Delta s^2 = -c^2(\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 \qquad (2.65) $$
is independent of the Lorentz frame. (Here c is the speed of light.) Let us rescale
our coordinates by defining $x^0 = ct$, $x^1 = x$, $x^2 = y$, $x^3 = z$. (It is a common
convention to use numerical indices beginning with 0 rather than 1 in relativity.)
Let us also follow the convention of letting Greek letters represent spacetime
indices. This will help us distinguish discussions applicable only to relativity
from those of more general scope. Then equation (2.65) has the form

$$ \Delta s^2 = -(\Delta x^0)^2 + (\Delta x^1)^2 + (\Delta x^2)^2 + (\Delta x^3)^2 = \eta_{\alpha\beta}\,\Delta x^\alpha \Delta x^\beta, \qquad (2.66) $$
where $\eta_{\alpha\beta}$ is the matrix
$$ \eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1). \qquad (2.67) $$

We now interpret (2.66) as defining the pseudo-norm (§1.5) of a vector Δx
whose components are $(\Delta x^0, \Delta x^1, \Delta x^2, \Delta x^3)$. It is easy to see that this
pseudo-norm satisfies axioms (Nii) and (Niv) of §1.5, which are required for an inner
product to be defined. This is clearly

$$ V \cdot W = \eta_{\alpha\beta}\,V^\alpha W^\beta, \qquad (2.68) $$
and so we see that $\eta_{\alpha\beta}$ is in fact the metric tensor in canonical form and the
Lorentz frame is the associated orthonormal basis.

This metric gives a good illustration of the difference between the
components of a vector and its associated one-form. In a Lorentz frame
$$ U_0 = \eta_{0\alpha}U^\alpha = -U^0, \qquad (2.69a) $$
$$ U_1 = U^1, \quad U_2 = U^2, \quad U_3 = U^3. \qquad (2.69b) $$
Consider the vector gradient of a function f, which is the vector mapped from
the one-form $\tilde{\mathrm{d}}f$. The gradient $\tilde{\mathrm{d}}f$ has components $(\partial f/\partial x^0, \partial f/\partial x^1, \ldots)$ while
the vector $\bar{\mathrm{d}}f$ has $(-\partial f/\partial x^0, \partial f/\partial x^1, \ldots)$. Many treatments of special relativity
introduce the gradient as a vector operator with components $(-\partial/\partial x^0,
\partial/\partial x^1, \ldots)$. This odd sign is a clumsiness forced by the fact that the gradient
is really a one-form.

A manifold M with a metric $\mathbf{g}$ is called Minkowski spacetime only if there
exists a single coordinate system covering all of M in which $\mathbf{g}$ has components
$\eta_{\alpha\beta}$. This coordinate system is a good one to work in, but it is not the only one
possible on M. One can perfectly well choose others, such as those associated
with accelerated observers. Provided one follows the general rules of differential
geometry one will get the correct physical results.

The Euler angles and other parameterizations of the rotation group
are discussed at length in H. Goldstein, Classical Mechanics (Addison- 
Wesley, Reading, Mass., 1950). The student may find it helpful to read 
Goldstein’s treatment in conjunction with our discussion of the rota- 
tion group in chapter 3 below. 


3 LIE DERIVATIVES AND LIE GROUPS 


3.1 Introduction: how a vector field maps a manifold into itself 

In the previous sections we have developed certain aspects of index 
notation. This notation is often essential for dealing with actual numerical 
computations; but it is just as often a hindrance in developing a sound
geometrical idea of what the mathematics means. We began by defining vectors and
tensors in a manner independent of any basis, and we now continue in this spirit 
to develop what is one of the most useful analytic tools in geometry: the Lie 
derivative along the congruence defined by a vector field. 

We have mentioned the idea of a ‘congruence’ in §2.12: a set of curves that
fill the manifold, or some part of it, without intersecting. Each point in the
region of the manifold M is on one and only one curve. Since each curve is a
one-dimensional set of points, the set of curves is (n − 1)-dimensional. (With some
suitable parameterization, the set of curves is itself a manifold.) The key point
from which everything else follows is that the congruence provides a natural
mapping of the manifold into itself. If the parameter on the curves is λ, then any
sufficiently small number Δλ defines a mapping in which each point is mapped
into the one a parameter distance Δλ further along the same curve of the
congruence (see figure 3.1). This is a 1-1 mapping, at least in any region in
which the vector field is sufficiently well-behaved (a $C^1$ field will do). If the
vector field is $C^\infty$, the mapping is a diffeomorphism (see §2.4). If the map exists
for all Δλ, there is a one-dimensional differentiable family of such mappings (a
one-parameter Lie group, in fact, with composition law $\Delta\lambda_1 + \Delta\lambda_2$). Such a
mapping is called a ‘dragging’ along the congruence, or a Lie dragging.


Fig. 3.1. The mapping of M to itself defined by mapping each point to
the point on the same curve of the congruence whose parameter is some
fixed number Δλ larger.


3.2 Lie dragging a function 

If a function f is defined on the manifold, then the mapping defines a
new function $f^*_{\Delta\lambda}$ by ‘carrying’ f along the congruence in an
obvious manner: if a point P on a certain curve in figure 3.1 is mapped to the
point Q, a parameter distance Δλ further along the same curve, then the new
field $f^*_{\Delta\lambda}$ has the same value at Q as f had at P,

$$f^*_{\Delta\lambda}(Q) = f(P).$$

(Here the asterisk on $f^*_{\Delta\lambda}$ simply means ‘new’.) If it happens
that the value $f^*_{\Delta\lambda}(Q)$ in fact equals the old value at the
point Q, f(Q), for all Q,

$$f^*_{\Delta\lambda} = f,$$

then the function is invariant under the mapping. If the function is invariant
for all Δλ then it is said to be Lie dragged. Clearly, a function that is Lie
dragged must be constant along any curve of the congruence: df/dλ = 0.
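
The definition can be tried out numerically. The following is only a minimal sketch under an assumed example: the congruence is taken to be the circles about the origin of the plane, generated by V = −y ∂/∂x + x ∂/∂y, so that dragging by Δλ is a rotation through the angle Δλ. The names `drag_point` and `dragged_function` are ours, introduced for illustration.

```python
import math

def drag_point(p, dlam):
    """Map p = (x, y) a parameter distance dlam along the assumed congruence
    of circles generated by V = -y d/dx + x d/dy (a rotation by dlam)."""
    x, y = p
    c, s = math.cos(dlam), math.sin(dlam)
    return (c * x - s * y, s * x + c * y)

def dragged_function(f, dlam):
    """Return the new function f*, defined by f*(Q) = f(P), where Q is the
    image of P under the dragging."""
    def f_star(q):
        p = drag_point(q, -dlam)  # the point P that is carried to Q
        return f(p)
    return f_star

def radius_squared(p):
    return p[0] ** 2 + p[1] ** 2

def x_coordinate(p):
    return p[0]

Q = (0.3, -1.2)
dlam = 0.7
r2_star = dragged_function(radius_squared, dlam)
x_star = dragged_function(x_coordinate, dlam)

# x^2 + y^2 is constant on each circle, so dragging leaves it unchanged.
r2_is_invariant = abs(r2_star(Q) - radius_squared(Q)) < 1e-12
# The coordinate x varies along the circles, so it is not invariant.
x_is_invariant = abs(x_star(Q) - x_coordinate(Q)) < 1e-12
print(r2_is_invariant, x_is_invariant)
```

In this example the function x² + y² is Lie dragged by V, in agreement with df/dλ = 0 along each circle, while the function x is not.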


3.3 Lie dragging a vector field 
To see the effect this map has on vector fields, recall that any vector
field is defined by the congruence of curves for which it is the tangent field.
In figure 3.2, we show two congruences: one, for d/dλ, generates a map of the
manifold; the other, which defines the arbitrary field d/dμ, will be acted on by
the map. This action is very simple: any curve of the μ-congruence is mapped
into a new curve which runs through the images under the Lie dragging of the
points it used to run through, and the parameter values μ are carried along to
the new points as well. This defines a new congruence with parameter
$\mu^*_{\Delta\lambda}$. This new congruence has a tangent vector field
$d/d\mu^*_{\Delta\lambda}$, which is called the image of d/dμ under the Lie
dragging.


Fig. 3.2. How a new vector field $d/d\mu^*_{\Delta\lambda}$ is defined by Lie
dragging its path and its parameter μ. Curves (1)-(4) are members of the
λ-congruence. Curve (A) is a μ-curve passing through P and is mapped to
curve (A') by being Lie dragged a parameter distance Δλ. Curve (B) is a
μ-curve of the old congruence also passing through Q. The image of (B)
under the dragging is not shown. In general (B) and (A') will be different
curves. If they are the same the μ-congruence is said to be Lie dragged.

In general the $\mu^*_{\Delta\lambda}$-congruence is different from the
μ-congruence. If it is the same, then $d/d\mu^*_{\Delta\lambda} = d/d\mu$
everywhere and we say the vector field and congruence are invariant under the
map. If they are invariant for all Δλ then we say they are Lie dragged by the
vector field d/dλ.

A Lie dragged vector field has a simple geometric interpretation, illustrated
by figure 3.3. It is clear that (in the limit of infinitesimal Δλ and
infinitesimal separation between curves (2) and (3)) if d/dμ at P ‘stretches’
exactly from P to R on curve (A), then $d/d\mu^*_{\Delta\lambda}$ stretches
exactly from Q to S on (A'). If d/dμ is Lie dragged, then curve (B) of figure 3.2
coincides with (A') and $(d/d\mu^*_{\Delta\lambda})_Q = (d/d\mu)_Q$, so d/dμ
also stretches from Q to S. Referring to our discussion of Lie brackets in
§2.14, we find that this implies [d/dλ, d/dμ] = 0: a vector field is Lie
dragged if its Lie bracket with the dragging field vanishes:

$$[d/d\lambda,\; d/d\mu] = 0. \tag{3.1}$$

There is another way to see the same thing. Suppose we look at figure 3.3
differently, as if we were given only a single curve (A) with parameter μ, not a
whole congruence. Then we can generate from this curve a whole congruence by
Lie dragging it for all possible values of Δλ. One such curve is (A'). Let us
call this field $d/d\mu_L$, with parameter $\mu_L$. By this construction, the
derivative d/dλ is always on a curve of fixed $\mu_L$ and the derivative
$d/d\mu_L$ is always on a curve of fixed λ. Therefore they must commute.


Fig. 3.3. The central section of figure 3.2, with curve (B) omitted and
the tangent vectors to (A) and (A') at P and Q respectively drawn in.
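
The link between Lie dragging and commutation can be illustrated with explicit flows. The sketch below assumes three simple fields on the plane, chosen by us for illustration: a rotation field −y ∂/∂x + x ∂/∂y, a dilation field x ∂/∂x + y ∂/∂y, and the translation field ∂/∂x. It moves a point along two of the congruences in both orders and measures how far apart the two end points are.

```python
import math

def rotate(p, a):
    """Flow of -y d/dx + x d/dy for a parameter distance a."""
    c, s = math.cos(a), math.sin(a)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def dilate(p, b):
    """Flow of x d/dx + y d/dy for a parameter distance b."""
    k = math.exp(b)
    return (k * p[0], k * p[1])

def translate_x(p, c):
    """Flow of d/dx for a parameter distance c."""
    return (p[0] + c, p[1])

def gap(flow1, flow2, p, s, t):
    """Distance between the end points reached by applying the two flows
    in opposite orders, starting from the same point p."""
    q1 = flow2(flow1(p, s), t)
    q2 = flow1(flow2(p, t), s)
    return math.hypot(q1[0] - q2[0], q1[1] - q2[1])

P = (1.0, 0.5)
gap_rotation_dilation = gap(rotate, dilate, P, 0.4, 0.3)
gap_rotation_translation = gap(rotate, translate_x, P, 0.4, 0.3)
print(gap_rotation_dilation, gap_rotation_translation)
```

The rotation and dilation fields have vanishing Lie bracket, so each is Lie dragged by the other and the two paths close up to rounding error. The rotation and translation fields do not commute, and the paths fail to close.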


3.4 Lie derivatives 

The concept of dragging permits the definition of a derivative along the 
congruence. There is a difficulty inherent in any attempt to define derivatives of 
vector and tensor fields. Consider trying to define a vector field’s derivative as 
the limit of the difference between the vectors at different points divided by the 
distance between the points. One problem is defining ‘distance’ between points;
if one has a curve between the points, one can take this to be the difference 
between the parameter values at the points. (This gives a derivative with respect 
to the parameter, and on manifolds without metrics this is all one can hope for.) 
A more serious problem is the comparison of vectors at different points: are two 
vectors at different points ‘parallel’ or not? In the Euclidean plane this is a 
simple question to answer. On a curved surface it may not have a unique answer. 
On a simple differentiable manifold the question of parallelism at different 
points does not even make sense, since there are no ‘markers’ or rules for moving 
vectors around in a parallel manner. One must add more structure — called an 
‘affine connection’ — to the manifold in order to define an absolute parallelism. 
This is treated in chapter 6 on Riemannian geometry. What we shall consider 
here is an alternative that one should expect to find useful in any problem in 
which a congruence plays a central role. The congruence itself can provide a
substitute for the concept of parallelism at different points. That is, when
comparing vectors at points λ and λ + Δλ on a certain curve, one can Lie drag
the vector at λ + Δλ back to the point λ. This defines a new vector at λ, which
can be subtracted from the old one to define the difference between them. Notice
that this is a unique difference, and hence a unique derivative, given the
congruence. But it does depend on the congruence.

Let us derive analytic expressions for this. First consider a scalar function.
Evaluate the scalar at the point $\lambda_0 + \Delta\lambda$, drag it back to
$\lambda_0$, subtract the value of the scalar at $\lambda_0$, divide by Δλ and
take the limit Δλ → 0. Its value at $\lambda_0 + \Delta\lambda$ is
$f(\lambda_0 + \Delta\lambda)$. By dragging one defines a new scalar field
$f^*$, whose value is defined by the rule $df^*/d\lambda = 0$. Therefore its
value at $\lambda_0$ is the same as at $\lambda_0 + \Delta\lambda$:
$f^*(\lambda_0) = f(\lambda_0 + \Delta\lambda)$. The derivative so defined is

$$\lim_{\Delta\lambda\to 0} \frac{f^*(\lambda_0) - f(\lambda_0)}{\Delta\lambda}
= \lim_{\Delta\lambda\to 0} \frac{f(\lambda_0 + \Delta\lambda) - f(\lambda_0)}{\Delta\lambda}
= \left[\frac{df}{d\lambda}\right]_{\lambda_0}. \tag{3.2}$$

The result for the Lie derivative of f is not, of course, surprising. There is a
special notation for the Lie-derivative operator: $\pounds_V$, where V is the
vector field generating the mappings (d/dλ in our case). We have proved that for
functions

$$\pounds_V f = V(f) = \frac{df}{d\lambda}. \tag{3.3}$$

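Now we do the same for a vector field U = d/dμ. Since a vector is defined by
its effect on functions, we use an arbitrary function f in what follows. At
$\lambda_0$ the field U gives the derivative $(df/d\mu)_{\lambda_0}$, while at
$\lambda_0 + \Delta\lambda$ it gives $(df/d\mu)_{\lambda_0 + \Delta\lambda}$. By
dragging $U(\lambda_0 + \Delta\lambda)$ in the sense of §3.3, one gets a new
field $U^* = d/d\mu^*$, defined by $[U^*, V] = 0$ and by
$U^*(\lambda_0 + \Delta\lambda) = U(\lambda_0 + \Delta\lambda)$. The vanishing
of the commutator implies

$$\frac{d}{d\lambda}\frac{d}{d\mu^*} = \frac{d}{d\mu^*}\frac{d}{d\lambda} \tag{3.4}$$

everywhere. Therefore we have (for analytic vector fields)

$$\left[\frac{d}{d\mu^*}f\right]_{\lambda_0}
= \left[\frac{d}{d\mu^*}f\right]_{\lambda_0+\Delta\lambda}
- \Delta\lambda\left[\frac{d}{d\lambda}\left(\frac{d}{d\mu^*}f\right)\right]_{\lambda_0}
+ O(\Delta\lambda^2)$$

$$= \left[\frac{d}{d\mu}f\right]_{\lambda_0+\Delta\lambda}
- \Delta\lambda\left[\frac{d}{d\mu^*}\left(\frac{d}{d\lambda}f\right)\right]_{\lambda_0}
+ O(\Delta\lambda^2)$$

$$= \left[\frac{d}{d\mu}f\right]_{\lambda_0}
+ \Delta\lambda\left[\frac{d}{d\lambda}\left(\frac{d}{d\mu}f\right)\right]_{\lambda_0}
- \Delta\lambda\left[\frac{d}{d\mu^*}\left(\frac{d}{d\lambda}f\right)\right]_{\lambda_0}
+ O(\Delta\lambda^2).$$

We define the Lie derivative $\pounds_V U$ as the vector field which operates on
f to give

$$[\pounds_V U](f) = \lim_{\Delta\lambda\to 0}
\left[\frac{U^*(\lambda_0) - U(\lambda_0)}{\Delta\lambda}\right](f) \tag{3.5}$$

$$= \lim_{\Delta\lambda\to 0}
\left(\frac{d}{d\lambda}\frac{d}{d\mu}f - \frac{d}{d\mu^*}\frac{d}{d\lambda}f\right).$$

Now, the difference between $\mu^*$ and μ is clearly a term of first order in Δλ,
which means we can replace $\mu^*$ by μ in the last equation above. Since this
equation is true for all f, we have

$$\pounds_V U = \frac{d}{d\lambda}U - \frac{d}{d\mu}V = [V, U]. \tag{3.6}$$

This is again a sensible result. By definition of the Lie derivative along V, a
vector field has a zero Lie derivative if it is Lie dragged, i.e. if it has zero
Lie bracket with V. Therefore it makes sense that its derivative is in fact its
Lie bracket. By the antisymmetry of the Lie bracket we find

$$\pounds_V U = -\pounds_U V. \tag{3.7}$$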
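
As a numerical illustration, the Lie derivative of one vector field along another can be evaluated from the coordinate expression for the Lie bracket, $[V,U]^i = V^j\,\partial U^i/\partial x^j - U^j\,\partial V^i/\partial x^j$ (equation (2.7)). The sketch below is not a general-purpose routine: it approximates the partial derivatives by central differences, and the two example fields and the function name `lie_bracket` are our own choices.

```python
def lie_bracket(V, U, p, h=1e-5):
    """Components of [V, U] at the point p, using
    [V, U]^i = V^j d_j U^i - U^j d_j V^i with central differences."""
    n = len(p)

    def jacobian(F):
        rows = []
        for j in range(n):
            plus, minus = list(p), list(p)
            plus[j] += h
            minus[j] -= h
            Fp, Fm = F(plus), F(minus)
            rows.append([(Fp[i] - Fm[i]) / (2 * h) for i in range(n)])
        return rows  # rows[j][i] approximates d F^i / d x^j

    Vp, Up = V(p), U(p)
    dV, dU = jacobian(V), jacobian(U)
    return [
        sum(Vp[j] * dU[j][i] - Up[j] * dV[j][i] for j in range(n))
        for i in range(n)
    ]

def rotation(p):
    """Assumed example field V = -y d/dx + x d/dy."""
    x, y = p
    return [-y, x]

def x_translation(p):
    """Assumed example field U = d/dx."""
    return [1.0, 0.0]

point = [0.8, -0.3]
lie_V_U = lie_bracket(rotation, x_translation, point)  # components of L_V U
lie_U_V = lie_bracket(x_translation, rotation, point)  # components of L_U V
print(lie_V_U, lie_U_V)
```

For these two fields the bracket is [V, U] = −∂/∂y, so the two results should be close to (0, −1) and (0, 1), which also exhibits the antisymmetry (3.7).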


Exercise 3.1
(a) Show that, on functions and fields,

$$[\pounds_V, \pounds_W] = \pounds_{[V,W]} \tag{3.8}$$

    for any two twice-differentiable vector fields V and W.

(b) Prove the Jacobi identity for Lie derivatives on functions and vector
    fields:

$$[[\pounds_X, \pounds_Y], \pounds_Z] + [[\pounds_Y, \pounds_Z], \pounds_X]
+ [[\pounds_Z, \pounds_X], \pounds_Y] = 0, \tag{3.9}$$

    where X, Y, Z are any three-times-differentiable vector fields.

    (Hint: for (a) on vectors, show that (3.8) is equivalent to (2.14). For
    (b) on vectors, use (3.8) and the fact that, as is obvious from its
    definition, $\pounds_A + \pounds_B = \pounds_{A+B}$.)


Exercise 3.2
(a) Deduce the Leibniz rule

$$\pounds_V(fU) = (\pounds_V f)\,U + f\,\pounds_V U \tag{3.10}$$

    from the definitions of $\pounds_V$ on functions and vector fields.

(b) From (2.7) we know that the components of $\pounds_V U$ on a coordinate
    basis are

$$(\pounds_V U)^i = V^j \frac{\partial}{\partial x^j} U^i
- U^j \frac{\partial}{\partial x^j} V^i. \tag{2.7}$$

    Given an arbitrary basis $\{e_i\}$ for vector fields, show from (a) that

$$(\pounds_V U)^i = V^j e_j(U^i) - U^j e_j(V^i)
+ V^j U^k (\pounds_{e_j} e_k)^i, \tag{3.11}$$

    where $e_j(U^i)$ means the derivative of the function $U^i$ with respect to
    the vector field $e_j$.


Exercise 3.3
Show that if one chooses a coordinate system in which V is a coordinate
basis vector, say $\partial/\partial x^1$, then for any vector field W

$$(\pounds_V W)^i = \partial W^i/\partial x^1. \tag{3.12}$$

That is, the Lie derivative is the coordinate-independent form of the
partial derivative.


3.5 Lie derivative of a one-form
Since fields of one-forms and tensors of higher rank are defined in terms
of vector fields and scalar functions, one can deduce the Lie derivatives of
one-forms from the Lie derivatives of vectors and scalars. Conceptually, the
definition is the same: a one-form field is said to be Lie dragged if its value
on any Lie dragged vector field is constant. The Lie derivative is found by
dragging the one-form at $\lambda_0 + \Delta\lambda$ back to $\lambda_0$ and
taking the difference. The result is that if $\tilde\omega$ is a one-form, then
$\pounds_V\tilde\omega$ is the one-form field which is the Lie derivative of
$\tilde\omega$ along V defined by the product rule (just the Leibniz rule for
first-order derivatives):

$$\pounds_V[\tilde\omega(W)] = (\pounds_V\tilde\omega)(W) + \tilde\omega(\pounds_V W) \tag{3.13}$$

for all vector fields W. Since $\tilde\omega(W)$ is simply a function, this
defines $\pounds_V\tilde\omega$ in terms of known operations, the Lie derivative
of functions and vector fields.


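Exercise 3.4
From (3.13) and the expression (2.7) for the components of
$\pounds_V W = [V, W]$, deduce that $\pounds_V\tilde\omega$ has components, on a
coordinate basis,

$$(\pounds_V\tilde\omega)_i = V^j \frac{\partial}{\partial x^j}\omega_i
+ \omega_j \frac{\partial}{\partial x^i} V^j. \tag{3.14}$$


The natural extension of (3.13) to tensors of higher type gives the Lie
derivative the properties

$$\pounds_V(A \otimes B) = (\pounds_V A) \otimes B + A \otimes (\pounds_V B), \tag{3.15}$$

$$\pounds_V\bigl(T(\tilde\omega, \dots; U, \dots)\bigr)
= (\pounds_V T)(\tilde\omega, \dots; U, \dots)
+ T(\pounds_V\tilde\omega, \dots; U, \dots) + \dots
+ T(\tilde\omega, \dots; \pounds_V U, \dots) + \dots, \tag{3.16}$$

where A, B, T are arbitrary tensors and $\tilde\omega$ and U arbitrary one-form
and vector, respectively.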
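
A short worked case may help fix the signs in the component formula (3.14). The fields used here are our own illustrative choice, not taken from the text: V is the rotation field $-y\,\partial/\partial x + x\,\partial/\partial y$ on the plane and $\tilde\omega$ is the coordinate gradient $\tilde d x$.

```latex
% Illustrative choice (an assumption, not from the text):
%   V = -y \partial_x + x \partial_y,   \tilde\omega = \tilde d x.
\begin{align*}
V^x &= -y, \quad V^y = x, \qquad \omega_x = 1, \quad \omega_y = 0, \\
(\pounds_V \tilde\omega)_x
  &= V^j \partial_j \omega_x + \omega_j\, \partial_x V^j
   = 0 + \partial_x(-y) = 0, \\
(\pounds_V \tilde\omega)_y
  &= V^j \partial_j \omega_y + \omega_j\, \partial_y V^j
   = 0 + \partial_y(-y) = -1, \\
\pounds_V\, \tilde d x &= -\,\tilde d y .
\end{align*}
% Check against the product rule (3.13) with W = \partial_y:
%   \tilde\omega(W) = 0, so \pounds_V[\tilde\omega(W)] = 0;
%   \pounds_V W = [V, \partial_y] = \partial_x, so \tilde\omega(\pounds_V W) = 1;
%   (\pounds_V \tilde\omega)(W) = -\tilde d y(\partial_y) = -1;
%   and indeed 0 = -1 + 1.
```

The result is what one would expect geometrically: rotating the surfaces of constant x through a small angle tilts them toward surfaces of constant y.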




3.6 Submanifolds 

A submanifold of a manifold M is a manifold which is a smooth subset
of M. If M is ordinary three-dimensional Euclidean space, then ordinary smooth
surfaces and curves are submanifolds. In four-dimensional Minkowski spacetime
(§2.31), the three-dimensional space of events simultaneous to a given event in
the view of a particular observer (same time coordinate t) is a submanifold, and
so is the hyperboloid of all events at constant interval $\Delta s^2$ from a given event.
The word ‘hypersurface’ is sometimes used instead of ‘submanifold’, but some 
textbooks use ‘hypersurface’ only to describe a submanifold whose dimension is 
one less than that of M. 

Although the idea of a submanifold is easy enough to visualize in simple
cases, the word ‘smooth’ in the definition given above needs to be made more
precise, and different textbooks give different (and inequivalent) definitions.
We shall use the one which guarantees the greatest smoothness and is closest to
our definition of a manifold. An m-dimensional submanifold S of an n-dimensional
manifold M is a set of points of M which have the following property: in some
open neighborhood in M of any point P of S there exists a coordinate system
for M in which the points of S in that neighborhood are the points characterized
by $x^1 = x^2 = \dots = x^{n-m} = 0$ (see figure 3.4). A one-dimensional
submanifold is a kind of curve, and its smoothness requirement is illustrated in
figure 3.5. It is clear that the definition of S guarantees it is itself a
manifold, since it has the requisite coordinate patches (charts). A special case
is m = n: any open set of M is a submanifold of M.

Our interest in submanifolds stems mostly from the fact that solutions of
differential equations are usually relations, say
$\{y_i = f_i(x^1, \dots, x^m),\ i = 1, \dots, p\}$, which can be thought of as
submanifolds with coordinates $\{x^1, \dots, x^m\}$ of a larger manifold whose
coordinates are $\{y_1, \dots, y_p;\ x^1, \dots, x^m\}$. We shall begin our
investigation of submanifolds from a different perspective, however, and the
tie-in with differential equations will not come until chapter 4.

Suppose P is a point of a submanifold S (dimension m) of M (dimension n). A
curve in S through P is also a curve in M through P, so naturally a tangent vector
to such a curve at P is an element of both $T_P$, the tangent space to M at P, and
$V_P$, the tangent space to S at P. In fact, $V_P$ is a vector subspace of $T_P$
of dimension m. On the other hand, an arbitrary vector of $T_P$ not in $V_P$ has
no unique ‘projection’ onto $V_P$ (recall there is no notion of orthogonality in
general).


Fig. 3.4. A two-dimensional submanifold S of a three-dimensional mani- 
fold M is shown, along with coordinates near a point P which satisfy the 
definition given in the text. The coordinate line of $x^1$ intersects S only
at P. 





Fig. 3.5. A candidate for a one-dimensional submanifold of a two- 
dimensional manifold, which fails because it crosses itself at P. At P one 
cannot construct the necessary coordinates. Only some curves, there- 
fore, are submanifolds. 




The situation for one-forms at P is just the reverse. Let $T^*_P$ be the dual of
$T_P$, the set of one-forms at P which are functions defined on all of $T_P$.
Similarly, let $V^*_P$ be the dual of $V_P$, the one-forms S itself has at P. Any
one-form in $T^*_P$ defines one in $V^*_P$: this only involves restricting its
domain from all of $T_P$ down to its subspace $V_P$. But there is no unique
element of $T^*_P$ corresponding to a given element of $V^*_P$, since simply
knowing the values of a one-form on $V_P$ does not tell us what its value will be
on a vector not in $V_P$.

In summary, then, a vector defined on a submanifold S is also a vector on M,
and a one-form on M is also a one-form on S. But neither statement is reversible.
We will discuss one-forms and submanifolds again in chapter 4. Here we shall
concentrate on vector fields.


3.7 Frobenius’ theorem (vector field version) 

In any coordinate patch of S there are coordinates $\{y^a,\ a = 1, \dots, m\}$
and basis vectors $\{\partial/\partial y^a\}$ for vector fields on S. All these
basis fields naturally commute:

$$[\partial/\partial y^a,\ \partial/\partial y^b] = 0. \tag{3.17}$$


Exercise 3.5 

(a) Show that if V and W are linear combinations (not necessarily with con- 
stant coefficients) of m vector fields that all commute with one another, 
then the Lie bracket of V and W is a linear combination of the same m 
fields. 

(b) Prove the same result when the m vector fields have Lie brackets which 
are nonvanishing linear combinations of the m fields. 


From exercise 3.5(a) it follows that any two vector fields on S have a Lie bracket
which is also tangent to S, since these fields are certainly linear combinations of
the commuting fields $\{\partial/\partial y^a\}$. The important statement is the
converse: if a set of m $C^\infty$ vector fields defined in a region U of M have
Lie brackets with one another, all of which are linear combinations of the m
vector fields, then the integral curves of the fields mesh to form a family of
submanifolds. Each submanifold has dimension equal to the dimension of the vector
space these fields define at any point, which is at most m, but which may be
smaller (as in §3.9 below). Each point of U is on one and only one such
submanifold, provided that the dimension of the vector space defined by the
fields is the same everywhere in U. This family of submanifolds fills U in much
the same way as a congruence of curves does (§2.12), and it is called a foliation
of U. Each submanifold is a leaf of the foliation. Two foliations are illustrated
in figure 3.6.

This result is called Frobenius’ theorem. The proof is sketched in the next
section, but it is easy to see the central idea. If the integral curves of the
various fields are to define a submanifold, they must remain tangent to it: no
curve can start ‘sticking out’ off it. This tangency is guaranteed if all the Lie
brackets are themselves tangent, since the Lie brackets are simply the
derivatives of the various vector fields along one another. If no vector field
has a derivative with a component off the hypersurface, then no integral curve
can leave the hypersurface. See figure 3.7 for some examples. When we come to the
study of differential forms, we shall encounter another version of Frobenius’
theorem, which will show us that it is the fundamental theorem giving conditions
for the existence of solutions to partial differential equations (‘integrability
conditions’).
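
The bracket condition of the theorem can be checked numerically for the two pairs of fields shown in figure 3.7. The sketch below rests on an assumption of ours: we take the spiral field of that figure to be the field on R³ whose components at (x, y, z) are (−sin z, cos z, 1), i.e. we identify the parameter λ with the height z. In three dimensions, the bracket of two independent fields A and B lies in the plane they span exactly when the determinant of the three vectors A, B, [A, B] vanishes; the helper names are ours.

```python
import math

def lie_bracket(V, U, p, h=1e-5):
    """[V, U]^i = V^j d_j U^i - U^j d_j V^i at p, by central differences."""
    n = len(p)

    def jacobian(F):
        rows = []
        for j in range(n):
            plus, minus = list(p), list(p)
            plus[j] += h
            minus[j] -= h
            Fp, Fm = F(plus), F(minus)
            rows.append([(Fp[i] - Fm[i]) / (2 * h) for i in range(n)])
        return rows  # rows[j][i] approximates d F^i / d x^j

    Vp, Up = V(p), U(p)
    dV, dU = jacobian(V), jacobian(U)
    return [sum(Vp[j] * dU[j][i] - Up[j] * dV[j][i] for j in range(n))
            for i in range(n)]

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def spiral(p):
    """Assumed form of the spiral field of figure 3.7 (lambda identified with z)."""
    return [-math.sin(p[2]), math.cos(p[2]), 1.0]

def e_x(p):
    return [1.0, 0.0, 0.0]

def e_z(p):
    return [0.0, 0.0, 1.0]

def closure_defect(A, B, p):
    """det[A, B, [A, B]] at p: zero when [A, B] lies in the plane of A and B."""
    return det3(A(p), B(p), lie_bracket(A, B, p))

p = [0.2, -0.5, 1.1]
defect_with_e_x = closure_defect(spiral, e_x, p)  # case (a) of figure 3.7
defect_with_e_z = closure_defect(spiral, e_z, p)  # case (b) of figure 3.7
print(defect_with_e_x, defect_with_e_z)
```

Under the stated assumption the first defect vanishes, consistent with the surfaces of case (a), while the second does not, so the spiral field and the z-basis field cannot mesh into a family of two-dimensional submanifolds.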


Fig. 3.6. (a) A foliation of R³ by parallel planes. Each point of R³ is on
one plane of the foliation. Only a few such planes are shown. (b) A
foliation of R³ by concentric spheres S². The centre is a degenerate
point of the foliation.







Fig. 3.7. In R³, the vector field dx/dλ = −sin λ, dy/dλ = cos λ, dz/dλ
= 1 spirals in the vertical direction with a spiral radius of 1. (a) The
spiral field and the x-basis vector field form a family of surfaces, each
point in R³ being on one. One such surface is illustrated as a wavy (but
not twisted) ribbon. In this view looking slightly down toward the x-y
plane we sometimes see one side of the ribbon (horizontal striping) and
sometimes the other (longitudinal striping). (b) Two vector fields which
do not form a submanifold are the spiral one and the z-basis vector field.
The plane defined by the two at any point is not tangent to the ‘next’
spiral curve above or below it.




3.8 Proof of Frobenius’ theorem 

Suppose in some open region U' of M we are given m' vector fields
which at every point P of U' span a subspace of $T_P$ of dimension m ≤ m'. (The
set of all these subspaces is called an m-dimensional distribution on M. This has
no relation to the delta-function distributions of §2.18.) At least in some
neighborhood U of any point P in U' we can choose m of the fields as a linearly
independent basis for the set, and these fields $\{V_{(a)},\ a = 1, \dots, m\}$
will (by exercise 3.5(b)) have the property

$$[V_{(a)}, V_{(b)}] = \sum_c \alpha_{abc}\, V_{(c)} \tag{3.18}$$

in U. So we never really need to consider the case where the fields are not linearly
independent: such a set reduces locally to a linearly independent set of smaller
dimension. Let the manifold M have dimension n.

The theorem is trivial when there is only one vector field V (i.e. m = 1). The
integral curves clearly exist in U if V(P) ≠ 0. Each curve is a one-dimensional
manifold, a submanifold of M.

The theorem for m ≥ 2 will be proved by induction. First we will establish a
formula we will find useful in the proof. From equation (3.14) it is easy to prove
that for any function f and vector field V,

$$\pounds_V(\tilde d f) = \tilde d(\pounds_V f). \tag{3.19}$$

Moreover, equation (3.15) implies, for any vector field W,

$$\pounds_V\langle \tilde d f, W\rangle
= \langle \pounds_V \tilde d f, W\rangle + \langle \tilde d f, \pounds_V W\rangle. \tag{3.20}$$

Combining these two equations and remembering that $\pounds_V W = [V, W]$, we find
the result we shall need:

$$\langle \tilde d f, [V, W]\rangle
= \pounds_V\langle \tilde d f, W\rangle - \langle \tilde d(\pounds_V f), W\rangle. \tag{3.21}$$

Returning now to the main proof, we first note that if the m vector fields all
actually commute (have zero Lie brackets with one another), then the
construction of §2.15 shows that they define a coordinate system for the points on
their integral curves, hence the required family of submanifolds of M. We shall
prove that the submanifolds exist in the general case (Lie brackets linearly
dependent on the fields) by constructing m linearly independent linear
combinations of the original fields which do commute. Thus, suppose we have m
linearly independent vector fields $V_{(a)}$ whose Lie brackets are linearly
dependent on the fields. We select any one, say $V_{(m)} = d/d\lambda_{(m)}$. Now
the parameter $\lambda_{(m)}$ along the $V_{(m)}$ congruence is a number defined at
every point, so it is a function on the region U of M we are looking at.
Accordingly, its gradient $\tilde d\lambda_{(m)}$ exists, and we use it as
follows. We define (m − 1) vector fields $X_{(a)}$ which are linear combinations
of all the original $V_{(a)}$'s and which satisfy

$$\langle \tilde d\lambda_{(m)}, X_{(a)}\rangle = 0, \qquad a = 1, \dots, m-1. \tag{3.22}$$

This determines the set $\{X_{(a)}\}$ up to linear combinations of themselves. Now
we write (again by exercise 3.5(b))

$$[X_{(a)}, X_{(b)}] = \sum_{c=1}^{m-1} \beta_{abc}\, X_{(c)} + \gamma_{ab}\, V_{(m)}, \tag{3.23}$$

$$[V_{(m)}, X_{(a)}] = \sum_{b=1}^{m-1} \mu_{ab}\, X_{(b)} + \nu_a\, V_{(m)}, \tag{3.24}$$

where $\beta_{abc}$, $\gamma_{ab}$, $\mu_{ab}$, and $\nu_a$ are all functions on U.
We contract these equations with $\tilde d\lambda_{(m)}$ and use (3.21), (3.22),
and the following simple identity

$$\langle \tilde d\lambda_{(m)}, V_{(m)}\rangle = \pounds_{V_{(m)}}\lambda_{(m)}
= d\lambda_{(m)}/d\lambda_{(m)} = 1
\;\Rightarrow\; \tilde d\bigl(\pounds_{V_{(m)}}\lambda_{(m)}\bigr) = 0. \tag{3.25}$$

The left-hand sides of both (3.23) and (3.24) contract to zero and the resulting
equations imply $\gamma_{ab} = \nu_a = 0$. So, in particular, the Lie brackets of
the $X_{(a)}$'s do not involve $V_{(m)}$ at all. This was the purpose of imposing
(3.22) on their construction.

We now invoke the inductive hypothesis, that any set of (m − 1) vector fields
having Lie brackets linearly dependent on them form an (m − 1)-dimensional
submanifold. This applies to the set $\{X_{(a)},\ a = 1, \dots, m-1\}$, which is
therefore assumed to form a family of (m − 1)-dimensional submanifolds filling U.
Define a set of vector fields $\{Y_{(a)},\ a = 1, \dots, m-1\}$ which form a
coordinate basis for one of the submanifolds, say S', so that these fields commute
on S'. We shall define fields $\{Z_{(a)},\ a = 1, \dots, m-1\}$ off S' by Lie
dragging along $V_{(m)}$:

$$Z_{(a)} = Y_{(a)} \ \text{on } S', \qquad
[V_{(m)}, Z_{(a)}] = 0 \ \text{in } U \text{ along any curve of } V_{(m)}
\text{ passing through } S'. \tag{3.26}$$

What we aim to prove is that the $Z_{(a)}$'s commute among themselves everywhere,
as they do on S'. Then we will have constructed the fully commuting set
$\{V_{(m)}, Z_{(a)},\ a = 1, \dots, m-1\}$ and proved the theorem. But first we
must establish that each $Z_{(a)}$ is still a linear combination of the
$V_{(a)}$'s. In fact we will prove that it is a linear combination of the
$X_{(a)}$'s alone, without $V_{(m)}$. Each field $Z_{(a)}$ is certainly unique, so
let us see whether we can satisfy (3.26) with a linear combination

$$Z_{(a)} = \sum_b \alpha_{ab}\, X_{(b)}$$

for a = 1, ..., m − 1.


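Then we must have (all sums running from 1 to m − 1)

$$0 = [V_{(m)}, Z_{(a)}] = \pounds_{V_{(m)}} Z_{(a)}
= \sum_b \bigl(\pounds_{V_{(m)}}\alpha_{ab}\bigr) X_{(b)}
+ \sum_b \alpha_{ab}\,[V_{(m)}, X_{(b)}]$$

$$= \sum_b \frac{d\alpha_{ab}}{d\lambda_{(m)}}\, X_{(b)}
+ \sum_{b,c} \alpha_{ab}\,\mu_{bc}\, X_{(c)}, \tag{3.27}$$

to achieve which, we have used (3.24) with $\nu_b = 0$. Redefining the summation
indices in the last sum (b → c and c → b) gives

$$0 = \sum_b \left(\frac{d\alpha_{ab}}{d\lambda_{(m)}}
+ \sum_c \alpha_{ac}\,\mu_{cb}\right) X_{(b)}.$$

Since the $X_{(b)}$'s are linearly independent, this requires

$$\frac{d\alpha_{ab}}{d\lambda_{(m)}} + \sum_c \alpha_{ac}\,\mu_{cb} = 0, \tag{3.28}$$

which is a set of ordinary differential equations. The initial conditions (at S'),
that $\alpha_{ab}$ give the appropriate combination of $X_{(b)}$'s to form
$Y_{(a)}$, determine a unique solution, which always exists. Therefore, at every
point the $Z_{(a)}$'s are linear combinations of the $X_{(a)}$'s.

The final step is to observe that the Lie dragging preserves the fact that they
commute:

$$[Z_{(a)}, Z_{(b)}] = 0, \qquad a, b = 1, \dots, m-1. \tag{3.29}$$

This can be proved by using the Jacobi identity, exercise 2.3, among the three
fields $V_{(m)}$, $Z_{(a)}$, and $Z_{(b)}$: since each $Z_{(a)}$ commutes with
$V_{(m)}$, the identity gives $[V_{(m)}, [Z_{(a)}, Z_{(b)}]] = 0$, so the bracket
$[Z_{(a)}, Z_{(b)}]$ is itself Lie dragged along $V_{(m)}$, and because it
vanishes on S' it vanishes along every curve of $V_{(m)}$ through S'. By
construction we now have m fields $\{V_{(m)}, Z_{(a)},\ a = 1, \dots, m-1\}$
which all commute and which therefore form a coordinate basis for a submanifold of
dimension m. Since the original fields $\{V_{(a)}\}$ are linear combinations of
these, we have proved the theorem.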


3.9 An example: the generators of S²

Readers familiar with angular momentum in quantum mechanics may
have found many of the ideas presented so far familiar. Consider the
(unnormalized) φ-basis vector of spherical coordinates, sometimes called
$e_\phi$:

$$e_\phi = -y\, e_x + x\, e_y,$$

where $e_x$ and $e_y$ are the usual Cartesian basis vectors. In our notation this
becomes

$$\frac{\partial}{\partial\phi} = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y},$$

which we shall call $l_z$, the ‘angular momentum operator’ for the z-direction:

$$l_z = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}.$$

(This differs from the usual definition in quantum mechanics by a factor ħ/i.)
One can define $l_x$ and $l_y$ in analogous ways, and one finds the commutation
relations (Lie brackets)

$$[l_x, l_y] = -l_z, \qquad [l_y, l_z] = -l_x, \qquad [l_z, l_x] = -l_y. \tag{3.30}$$

Therefore the three vectors determine a submanifold. However, it would appear
that this submanifold need only have dimension three (i.e. be all of the space)
because there are three vectors. That it is really two-dimensional, we can see by
realizing that if we define $r = (x^2 + y^2 + z^2)^{1/2}$, then
$l_x(r) = l_y(r) = l_z(r) = 0$. Put another way,

$$\tilde d r(l_x) = \tilde d r(l_y) = \tilde d r(l_z) = 0. \tag{3.31}$$

From our picture of $\tilde d r$ as a set of surfaces of constant r, and our
interpretation of its contraction with, say, $l_x$ as the number of such surfaces
$l_x$ pierces, we see that (3.31) means that $l_x$, $l_y$, and $l_z$ are all
tangent to the sphere r = const. Therefore, at any point they are linearly
dependent, and they generate a submanifold of dimension two: the sphere, of course.
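
These statements are easy to verify by machine because the three fields are linear in the Cartesian coordinates. As a sketch under our own conventions, write each field as $V^i = A^i{}_j x^j$ with a constant matrix A; the component formula (2.7) then gives $[V, U]^i = (BA - AB)^i{}_j x^j$ for fields with matrices A and B. The matrix names `L_X`, `L_Y`, `L_Z` and the helper functions are introduced here only for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket_matrix(A, B):
    """Matrix of [V, U] for the linear fields V^i = A^i_j x^j, U^i = B^i_j x^j.
    From (2.7): [V, U]^i = (B A - A B)^i_j x^j."""
    BA, AB = matmul(B, A), matmul(A, B)
    return [[BA[i][j] - AB[i][j] for j in range(3)] for i in range(3)]

def negate(A):
    return [[-entry for entry in row] for row in A]

def apply(A, p):
    """Components of the linear field with matrix A at the point p."""
    return [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]

L_X = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]   # l_x = y d/dz - z d/dy
L_Y = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]   # l_y = z d/dx - x d/dz
L_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]   # l_z = x d/dy - y d/dx

# The commutation relations (3.30).
relations_hold = (bracket_matrix(L_X, L_Y) == negate(L_Z)
                  and bracket_matrix(L_Y, L_Z) == negate(L_X)
                  and bracket_matrix(L_Z, L_X) == negate(L_Y))

# Tangency to the sphere: each field is orthogonal to the position vector,
# which is equivalent to l(r) = 0.
p = [2, -1, 3]
radial_parts = [sum(p[i] * apply(L, p)[i] for i in range(3))
                for L in (L_X, L_Y, L_Z)]

# Pointwise linear dependence with variable coefficients: x l_x + y l_y + z l_z.
combo = [p[0] * apply(L_X, p)[i] + p[1] * apply(L_Y, p)[i]
         + p[2] * apply(L_Z, p)[i] for i in range(3)]
print(relations_hold, radial_parts, combo)
```

The last check exhibits the dependence explicitly: at every point the combination $x\,l_x + y\,l_y + z\,l_z$ is the zero vector, with coefficients that vary from point to point.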


Exercise 3.6 
Show that exercise 3.3 is valid when W is replaced by any tensor field. 


Exercise 3.7 

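Define the operator

$$L^2 = \pounds_{l_x}\pounds_{l_x} + \pounds_{l_y}\pounds_{l_y} + \pounds_{l_z}\pounds_{l_z}. \tag{3.32}$$

Show that $\pounds_{l_z}$ and $L^2$ commute. By symmetry this also implies that
$L^2$ commutes with $\pounds_{l_x}$ and $\pounds_{l_y}$. Show that if f is a
scalar function, then

$$L^2 f = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}
\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right)
+ \frac{1}{\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2}, \tag{3.33}$$

where θ and φ are the usual spherical coordinates. That is, $L^2 f$ is the
angular part of $\nabla^2 f$ on the unit sphere.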





3.10 Invariance 

One of the principal uses of Lie derivatives in physics is to express the
notion that a tensor field is invariant under some transformation. We say that a
tensor field T is invariant under a vector field V if

$$\pounds_V T = 0. \tag{3.34}$$

If T has physical importance (e.g. it might be the metric tensor, or a scalar
field describing the potential energy of a particle, or a vector field of force)
then those special vector fields (if any) under which T is invariant will also be
important. For example, in the preceding section we discussed the vector fields
associated with rotations of the sphere. One knows that angular momentum will
be important in a physical problem only if the problem is invariant under the
rotations associated with at least one of the vector fields. For instance, if the
system is invariant under rotations in some plane, it is said to be axially
symmetric (or axisymmetric) and the angular momentum associated with the vector
generating those rotations is conserved. How this comes about will be discussed
in exercise 5.8. Here we shall look at invariance generally.

The following theorem is of central importance to the whole theory of
invariance. Suppose we have a set $F = \{T_1, T_2, \dots\}$ of tensor fields whose
invariance properties are being studied. Then the set of all vector fields V under
which all fields in F are invariant is a Lie algebra, as defined in §2.14. The
proof of this theorem has two steps. The first step is supplied by exercise 3.8,
which shows that the set of fields is a vector space over the real numbers.


Exercise 3.8
Show that if a tensor T is invariant under both V and W then it is
invariant under aV + bW, where a and b are constants.


The second step relies on the result of exercise 3.1(a), which applies as well to all
tensor fields by (3.13) and (3.15). If V and W are vector fields in the set then for
any tensor field $T_i$ in F

$$\pounds_V T_i = \pounds_W T_i = 0 \;\Rightarrow\; [\pounds_V, \pounds_W]\,T_i = 0
\;\Rightarrow\; \pounds_{[V,W]}\,T_i = 0. \tag{3.35}$$

Therefore [V, W] is in the set if V and W are. This proves the theorem. We will
shortly see that Lie algebras are very closely related to Lie groups, and this
theorem then explains some of the usefulness of Lie groups in physics. In the
next sections we will study some examples of invariance.

It is important to understand what sort of vector space this Lie algebra is.
One usually thinks of a linear combination of vector fields V and W as another
vector field aV + bW, where a and b are functions on the manifold. The linear
combinations permitted by exercise 3.8, however, use only constants for a and
b. The vector space we have constructed has the fields V and W as single
elements; it is not a fiber bundle of which V and W are cross-sections. It is more
like a finite-dimensional function space (see §2.3). This point may seem subtle,
but it is important for understanding the dimension of the vector space. For
example, the three vector fields $l_x$, $l_y$, and $l_z$ of the previous section
are linearly dependent as vector fields on R³, since they are all tangent to S².
But to express one in terms of the other two one must use a linear combination
with variable coefficients. Therefore the three fields are linearly independent
elements of the Lie algebra: no linear combination of them with constant
coefficients (not all zero) equals the zero element of the algebra, which is the
zero vector field. We therefore say that these vector fields are a basis for a
three-dimensional Lie algebra. The only other Lie algebra we have encountered so
far is the algebra of all tangent vector fields to a manifold or submanifold. No
finite number of such fields can be a basis for linear combinations using constant
coefficients, so we say that this Lie algebra is infinite-dimensional.


3.11 Killing vector fields 

Many manifolds of interest in physics have metrics, and it is therefore
of considerable interest whenever the metric is invariant with respect to some
vector field. A Killing vector field is defined to be a vector field V such that

$$\pounds_V \mathbf{g} = 0. \tag{3.36}$$

It can be deduced that the component form of this equation in a coordinate
system is (cf. equation (3.14))

$$(\pounds_V \mathbf{g})_{ij} = V^k \frac{\partial}{\partial x^k} g_{ij}
+ g_{ik}\frac{\partial}{\partial x^j} V^k
+ g_{kj}\frac{\partial}{\partial x^i} V^k = 0. \tag{3.37}$$

It is often convenient to use a coordinate system in which the integral curves of
V are one family of coordinate lines, say for the $x^1$ coordinate. Then, from
exercise 3.6 we find

$$(\pounds_V \mathbf{g})_{ij} = \frac{\partial}{\partial x^1} g_{ij} = 0, \tag{3.38}$$

and so the metric components are independent of the coordinate $x^1$. Conversely,
if there exists a coordinate system in which the components of the metric are
independent of a certain coordinate, then the basis vector for that coordinate is
a Killing vector. This is often a convenient way of identifying Killing vectors.

As an example, let us find the Killing vector fields of three-dimensional 
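Euclidean space. The metric in Cartesian coordinates has components 

$g_{ij} = \delta_{ij},$ (3.39)

which is independent of x, y, and z. Therefore $\partial/\partial x$, $\partial/\partial y$, and $\partial/\partial z$ are Killing 
vectors. The same metric in spherical polar coordinates has components 

$g_{rr} = \frac{\partial}{\partial r} \cdot \frac{\partial}{\partial r} = 1,$
$g_{\theta\theta} = \frac{\partial}{\partial\theta} \cdot \frac{\partial}{\partial\theta} = r^2,$ (3.40)
$g_{\phi\phi} = \frac{\partial}{\partial\phi} \cdot \frac{\partial}{\partial\phi} = r^2 \sin^2\theta.$

Therefore $\partial/\partial\phi$ is a Killing vector: $\bar l_z$, in fact. Clearly, $\bar l_x$ and $\bar l_y$ will also be Killing 
vectors. These six Killing vectors turn out to be a basis for the Lie algebra of 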
Killing vector fields. We shall prove this in chapter 5, part E, where we undertake 
a more thorough study of spaces with high symmetry. 
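
The component condition (3.37) can be checked numerically. The following is a minimal sketch, not part of the original development: it evaluates the right-hand side of (3.37) by central differences for the spherical-polar metric (3.40). The function names, the sample point, and the step size are illustrative assumptions. The rotation generator about the x-axis is taken here as $y\,\partial/\partial z - z\,\partial/\partial y$, with spherical components $V^\theta = -\sin\phi$, $V^\phi = -\cot\theta\cos\phi$; the overall sign is a convention we assume and does not affect the test.

```python
import math

def metric(x):
    """Components g_ij of (3.40) at x = (r, theta, phi)."""
    r, th, _ = x
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, (r * math.sin(th)) ** 2]]

def l_x(x):
    """Assumed spherical components of y d/dz - z d/dy (rotation about x)."""
    _, th, ph = x
    return [0.0, -math.sin(ph), -math.cos(ph) / math.tan(th)]

def d_theta(x):
    """Coordinate basis field d/d(theta); g_phiphi depends on theta."""
    return [0.0, 1.0, 0.0]

def shifted(x, k, h):
    """Copy of x with coordinate number k displaced by h."""
    y = list(x)
    y[k] += h
    return y

def lie_derivative_of_metric(V, x, h=1e-5):
    """Right-hand side of (3.37) at x, using central differences."""
    n = len(x)
    g, v = metric(x), V(x)
    dg, dV = [], []  # dg[k][i][j] = d g_ij / d x^k,  dV[k][i] = d V^i / d x^k
    for k in range(n):
        gp, gm = metric(shifted(x, k, h)), metric(shifted(x, k, -h))
        vp, vm = V(shifted(x, k, h)), V(shifted(x, k, -h))
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
        dV.append([(vp[i] - vm[i]) / (2 * h) for i in range(n)])
    return [[sum(v[k] * dg[k][i][j] + g[i][k] * dV[j][k] + g[k][j] * dV[i][k]
                 for k in range(n))
             for j in range(n)] for i in range(n)]

def max_abs(M):
    """Largest absolute value among the entries of a matrix."""
    return max(abs(entry) for row in M for entry in row)

point = (2.0, 0.7, 1.1)  # arbitrary point away from the axis
killing_residual = max_abs(lie_derivative_of_metric(l_x, point))
non_killing_residual = max_abs(lie_derivative_of_metric(d_theta, point))
print(killing_residual, non_killing_residual)
```

The first residual should sit at the level of finite-difference error, while the second is of order one, because $g_{\phi\phi}$ in (3.40) depends on $\theta$.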


3.12 Killing vectors and conserved quantities in particle dynamics 

It is well-known in classical mechanics that if a force is the gradient of a 
potential which is axially symmetric, then the angular momentum of a particle 
about the axis of symmetry is constant on the particle’s trajectory. Similarly, if 
the potential is independent of one of the Cartesian coordinates, say x, then the 
x-component of momentum is conserved. However, it is not often remarked that 
if the potential has some other sort of symmetry (constant, say, on a family of 
similar ellipsoids) then there is not a conserved momentum associated with that 
symmetry. That is, conserved quantities in particle dynamics do not follow 
simply from invariance of the potential under some motion (circular, linear, or 
elliptical in our three examples), but also require that the motion be along a 
Killing vector field of the Euclidean space in which the dynamics takes place. 
Although we do not yet have quite enough mathematical machinery to prove 
this assertion (deferred to exercise 5.8), its reasonableness can be seen from the 
equation of motion. Written in ordinary vector calculus notation, this is 

$m\dot{\mathbf{V}} = -\nabla\Phi$, or $m\dot V^i = -\nabla^i\Phi.$ (3.41)
But with our understanding that the vector gradient involves the metric, we 
know that this is really 

$m\dot V^i = -g^{ij}\, \frac{\partial\Phi}{\partial x^j}.$ (3.42)

Any invariants derived from this equation clearly must involve not only the 
invariance of $\Phi$ but also of $g^{ij}$. 
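
The role of the metric can be made concrete with a small calculation. Differentiating the quantity $m W_i V^i$ along a trajectory and using (3.41) in Cartesian coordinates gives $d(m W_i V^i)/dt = m V^i V^j \partial_j W_i - W^i \partial_i \Phi$. The sketch below evaluates this rate in the x-y plane for two cases: a rotation field with an axially symmetric potential, and an 'elliptical' field that leaves an ellipsoidal potential invariant but is not a Killing field. The field and potential definitions, the axis ratio, and the sample point are illustrative assumptions, not taken from the text.

```python
def rate_of_change(W, grad_phi, pos, vel, mass=1.0, h=1e-6):
    """d/dt of mass * W . vel in Cartesian coordinates (2-D sketch)."""
    x, y = pos
    vx, vy = vel
    # Partial derivatives of the components of W by central differences.
    dW_dx = [(p - q) / (2 * h) for p, q in zip(W(x + h, y), W(x - h, y))]
    dW_dy = [(p - q) / (2 * h) for p, q in zip(W(x, y + h), W(x, y - h))]
    # Kinematic term m V^i V^j d_j W_i: vanishes when W is a Killing field.
    kinematic = mass * (vx * vx * dW_dx[0] + vx * vy * dW_dy[0]
                        + vy * vx * dW_dx[1] + vy * vy * dW_dy[1])
    # Force term -W^i d_i Phi: vanishes when Phi is invariant along W.
    wx, wy = W(x, y)
    gx, gy = grad_phi(x, y)
    return kinematic - (wx * gx + wy * gy)

a, b = 2.0, 1.0  # semi-axes of the assumed family of similar ellipses

def rotation(x, y):
    """Rotation about the z-axis: a Killing field of the Euclidean plane."""
    return (-y, x)

def circular_grad(x, y):
    """Gradient of the axially symmetric potential Phi = x^2 + y^2."""
    return (2 * x, 2 * y)

def elliptical(x, y):
    """Motion along the ellipses x^2/a^2 + y^2/b^2 = const (not Killing)."""
    return (-(a / b) * y, (b / a) * x)

def ellipsoidal_grad(x, y):
    """Gradient of Phi = x^2/a^2 + y^2/b^2, constant on those ellipses."""
    return (2 * x / a ** 2, 2 * y / b ** 2)

pos, vel = (0.3, -0.4), (1.0, 1.0)
rate_rotation = rate_of_change(rotation, circular_grad, pos, vel)
rate_elliptical = rate_of_change(elliptical, ellipsoidal_grad, pos, vel)
print(rate_rotation, rate_elliptical)
```

In both cases the force term vanishes because the potential is invariant along W. Only the rotation also has a vanishing kinematic term, which is the Killing condition in Cartesian form; for the elliptical field the rate is $m V^x V^y (b/a - a/b)$, which is nonzero unless a = b.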


3.13 Axial symmetry 

To illustrate the natural way in which Lie derivatives enter problems 
with symmetry, we consider the case of axial symmetry. Axial symmetry is 
invariance under rotations about some fixed axis. (It should not be confused 
with cylindrical symmetry, which has the added assumption of invariance under 
translation along the axis of symmetry.) Let the angle about the axis be $\phi$. 
Situations often arise in which a problem has a certain ‘background’ axial symmetry. 
One may be dealing with a particle orbiting in an axially symmetric potential, or 
with small perturbations of an axially symmetric system. In such a case, one 
gets a linear equation for the unknown $\psi$, 

$L(\psi) = 0,$ (3.43)

where L is some operator which is independent of the coordinate transformation 
$\phi \to \phi + \mathrm{const}$. Solutions to (3.43) are not necessarily axially symmetric: the 
particle is at one angle at one time and another angle a moment later, or the 
perturbation has nonaxisymmetric initial values. But scalar solutions do have the nice 
property that, when Fourier-analyzed in $\phi$ as 

$\psi(\phi, x^j) = \sum_{m=-\infty}^{\infty} \psi_m(x^j)\, e^{im\phi},$ (3.44)

the functions $\psi_m(x^j)$ (the index j runs over all coordinates but $\phi$) satisfy the 
related differential equation 

$0 = L_m(\psi_m) = e^{-im\phi} L(\psi_m e^{im\phi}).$ (3.45)
The operators L and $L_m$ are not usually identical because L can contain 
derivatives with respect to $\phi$, but $L_m$ must not. For example, consider the operator 
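$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta} \sin\theta \frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2},$
which is clearly unchanged by the transformation $\phi \to \phi + \mathrm{const}$. Then when it 
operates on a function $f(r, \theta)\, e^{im\phi}$ it gives 

$\nabla^2 (f(r, \theta)\, e^{im\phi}) = e^{im\phi} \left\{ \frac{1}{r^2}\frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta} \sin\theta \frac{\partial}{\partial\theta} - \frac{m^2}{r^2\sin^2\theta} \right\} f(r, \theta).$ (3.46)

The operator in curly brackets is $\nabla^2_m$, as defined in (3.45). This Fourier 
decomposition of the function f is not usually useful in the case of particle motion, 
where the particle’s position is a delta function in $\phi$, but it is very helpful for 
continuous systems, like waves on an axially symmetric background. The key 
functions, $e^{im\phi}$, may be called scalar axial harmonics. 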
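
As an illustration of (3.46), the reduced operator in curly brackets can be applied numerically to a function of r and $\theta$ alone. The sketch below is our own check, not part of the text: the test function $f = r^2\sin^2\theta$ is chosen because $f e^{2i\phi} = (x + iy)^2$ is harmonic, so the m = 2 operator should annihilate it, whereas for m = 0 the same f is $x^2 + y^2$, whose Laplacian is 4. The function name, step size, and sample point are assumptions.

```python
import math

def laplacian_m(f, m, r, th, h=1e-3):
    """Operator in curly brackets in (3.46) applied to f(r, theta)."""
    # (1/r^2) d/dr ( r^2 df/dr ), by nested central differences
    def r2_df_dr(rr):
        return rr * rr * (f(rr + h, th) - f(rr - h, th)) / (2 * h)
    radial = (r2_df_dr(r + h) - r2_df_dr(r - h)) / (2 * h) / (r * r)

    # (1/(r^2 sin th)) d/dth ( sin th df/dth )
    def sin_df_dth(t):
        return math.sin(t) * (f(r, t + h) - f(r, t - h)) / (2 * h)
    angular = ((sin_df_dth(th + h) - sin_df_dth(th - h)) / (2 * h)
               / (r * r * math.sin(th)))

    # The phi-derivatives have been replaced by the factor -m^2.
    azimuthal = -m * m * f(r, th) / (r * math.sin(th)) ** 2
    return radial + angular + azimuthal

def f(r, th):
    """Test function r^2 sin^2(theta), i.e. x^2 + y^2."""
    return (r * math.sin(th)) ** 2

value_m2 = laplacian_m(f, 2, 1.5, 0.8)
value_m0 = laplacian_m(f, 0, 1.5, 0.8)
print(value_m2, value_m0)
```

The two values differ only through the term $-m^2/(r^2\sin^2\theta)$, which is the sole trace of the $\phi$-dependence left in the reduced operator.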

We say that a solution $\psi$ to (3.43) has axial eigenvalue m if 

$\pounds_{\bar e_\phi} \psi = im\psi,$ (3.47)

where $\bar e_\phi = \partial/\partial\phi$ is the tangent to the circles of symmetry. None of this is 
difficult if $\psi$ is a scalar function, but suppose one is dealing with a vector equation, 
such as the one for the vector potential of electromagnetism. It will again be use- 
ful to have axial harmonics here, but they must be vector axial harmonics. We 
proceed to construct these. 

Consider the submanifold $\phi = 0$ (really a submanifold with boundary, this 
being on the axis of symmetry). At each point choose a basis $\{\bar e_j\}$ for vectors 
tangent to the submanifold. Supplement this basis by $\bar e_\phi$ so that $\{\bar e_\phi, \bar e_j\}$ is a basis 
for all vectors tangent to the manifold at the points of the submanifold. Now 
generate a basis for the entire manifold by Lie dragging this basis along $\bar e_\phi$ all the 
way around the axis of symmetry, as shown in figure 3.8. The resulting fields all 
satisfy 
$\pounds_{\bar e_\phi} \bar e_j = 0,$ (3.48)

i.e. they are all axially symmetric. Notice that in conventional Cartesian 
coordinates the components of $\bar e_j$ change on going around the axis. Axial symmetry for 
a vector field does not mean that its Cartesian components are independent of $\phi$, 
but rather that its components in a coordinate system that includes $\phi$ are 
independent of $\phi$. 

We now have a basis which has axial eigenvalue 0, by equation (3.48). Clearly, 
a basis which has axial eigenvalue m is 

$\bar e_{mj} = \bar e_j\, e^{im\phi}, \quad \bar e_{m\phi} = \bar e_\phi\, e^{im\phi}.$ (3.49)
Any vector field V satisfying 

$\pounds_{\bar e_\phi} V = imV$
can be expressed as a linear combination of the vector axial harmonics of 
eigenvalue m given in (3.49), with coefficients which are independent of $\phi$. 


Exercise 3.9 

In Euclidean three-space, construct axial vector harmonics for rotations 
about the z-axis by choosing the basis in the plane $\phi = 0$ to be $\{\bar e_x, \bar e_z\}$. 
Find the Cartesian components of the three vector harmonics for m = 2. 
In a similar way, find the basis one-form axial harmonics for m = 2, 
beginning with $\{\tilde d x, \tilde d z\}$ in the plane $\phi = 0$. If f is a scalar function of 
axial eigenvalue 2, show that the gradient, $\tilde d f$, is a one-form of axial 
eigenvalue 2. Show that $\bar e_x + i\bar e_y$ has axial eigenvalue +1. 


Although we have not yet exploited it, there is clearly a close relationship 
here with group theory. The existence of axial symmetry means that the 
‘background’ physical situation is invariant under Lie draggings along $\partial/\partial\phi$. 
These draggings form a Lie group, as described in §3.1. The group involved, 
SO(2), is particularly simple. The more important example of the rotation group 
(whose associated symmetry is called spherical symmetry) is more complicated 
because the various Lie derivatives along $\bar l_x$, $\bar l_y$, and $\bar l_z$ do not commute. We can 
deal with this problem only by studying Lie groups themselves systematically, 
and that will occupy the rest of this chapter. 

Fig. 3.8. View down the axis of symmetry of a basis $\{\bar e_\phi, \bar e_j\}$ formed by 
Lie dragging along $\bar e_\phi$. 


3.14 Abstract Lie groups 

We have touched on Lie groups or Lie algebras several times. Now we 
shall study them more systematically. The main reason they are interesting in 
physics is, as we have seen, that they express the invariance properties of 
important tensors. We will explore that aspect in later sections. Here our inten- 
tion is to study the group manifold itself. This is an important distinction which 
must not be blurred: the group manifold is quite separate from whatever mani- 
fold contains the tensor whose invariance properties the group expresses. The 
manifold of all rotations (SO(3)) is different from the manifold whose 
coordinate systems are rotated ($E^3$). 

Let us assume we have a finite-dimensional Lie group, i.e. a $C^\infty$ manifold G 
of dimension n, which has the following $C^\infty$ maps (diffeomorphisms): any 
element g of G maps $h \mapsto gh$ (left translation by g) or $h \mapsto hg$ (right translation 
by g). We do not assume the group is Abelian (i.e. $hg \neq gh$ in general), and we 
shall denote the identity element by e. Any neighborhood of e is mapped by left 
translation along a particular g onto a neighborhood of g, as shown in figure 3.9. 
Because the map carries curves into curves it maps tangent vectors at e (elements 
of $T_e$) to those at g. This is a map called $L_g$: $T_e \to T_g$, which is also illustrated in 
figure 3.9. (The concept is the same as for the Lie dragging map, §3.3.) A vector 
field V on G is said to be left-invariant if $L_g$ maps V at e to V at g ($L_g$: $V(e) 
\mapsto V(g)$) for all g. By the group composition law it follows that $L_g$ maps $V(h) 
\mapsto V(gh)$ for any h in G, so that what we have is a natural definition of a 
‘constant’ vector field on G. It is also clear that each vector in $T_e$ defines a unique 
left-invariant vector field, so that the left-invariant vector fields form an 
n-dimensional vector space. (As in §3.10, linear combinations of these fields use 
constants, not functions on G.) In fact it is easy to see (figure 3.10) that if V and 
W are any two left-invariant vector fields, then $L_g$ maps [V, W] at e to [V, W] at 
g: the field [V, W] is also left-invariant. (The reader who is not convinced by the 
diagram is invited to use coordinates on G to prove the result.) This is important, 
because it means that the left-invariant vector fields form a Lie algebra. This is 
called the Lie algebra of G, denoted by $\mathcal{L}(G)$. (Some authors use $\mathfrak{g}$.) This Lie 
algebra is completely characterized by its structure constants $c^i{}_{jk}$, defined as 
follows. Let $\{V_{(i)}, i = 1, \ldots, n\}$ be a basis for the Lie algebra, a linearly 
independent set of left-invariant vector fields. (If they are linearly independent at 
one point, say e, the map $L_g$ shows they are independent everywhere.) Then we 
can always write 

$[V_{(j)}, V_{(k)}] = c^i{}_{jk}\, V_{(i)}$ (3.50)
(summation convention assumed). If all the structure constants vanish, the Lie 
algebra is said to be Abelian. We shall see that it implies G is also Abelian. 
Naturally the basis $\{V_{(i)}\}$ is not unique, and under a change of basis the numbers 
$c^i{}_{jk}$ transform as components of a $\binom{1}{2}$ tensor. Every Lie group and algebra has a 
unique ‘structure tensor’ C. There is a limited converse to this, that a given set of 
structure constants ‘almost’ determines the Lie group whose Lie algebra they 
embody. This will be discussed in §3.16 below. 

Fig. 3.9. The left translation along g maps a neighborhood of e onto 
one of g. There is a natural map of a vector at e to one at g. 


Fig. 3.10. The mapping of figure 2.21 for left-invariant vector fields. 
Because they are left-invariant, translations by parameter distance ε 
near e map into the same ones near g, and so the ‘gap’ near e that 
represents their Lie bracket is mapped to that near g, which is the bracket of 
the translated fields. 

Consider the integral curve of a left-invariant vector field V which passes 
through e. It has a unique tangent vector $V_e$ at e and a unique parameter t for 
which e corresponds to t = 0. As in §2.13, the points on the curve may be 
located by exponentiation of V, exp (tV). This just involves the diffeomorphism 
of G onto itself generated by V, as discussed in §3.1. Unlike an arbitrary vector 
field, V is determined completely by $V_e$, so we can denote the points of G on 
this curve by 

$g_{V_e}(t) = \exp(tV)|_e.$ (3.51)
Because exponentiation has, by definition, the property 

$\exp(t_2 V) \exp(t_1 V)|_e = \exp[(t_1 + t_2)V]|_e,$
the points on these integral curves form a group: 

$g_{V_e}(t_1 + t_2) = \exp[(t_1 + t_2)V]|_e = \exp(t_2 V) \exp(t_1 V)|_e = g_{V_e}(t_2)\, g_{V_e}(t_1).$ (3.52)

This is called a one-parameter subgroup of G. It is obviously always Abelian, 
$g_{V_e}(t_1 + t_2) = g_{V_e}(t_2 + t_1)$, simply because the group operation corresponds to 
addition of parameter values. To each vector in $T_e$ there corresponds a unique 
subgroup. Moreover, since every one-parameter subgroup must be a $C^\infty$ curve in 
G which passes through e (a subgroup must always contain the identity element) 
there is a one-to-one correspondence between the one-parameter subgroups of G 
and the elements of the Lie algebra of G. 


Exercise 3.10 

Define right-invariant vector fields. Show that they form a Lie algebra. 
Show that their integral curves through e coincide with those of the left- 
invariant fields. Show that their integral curves through other elements 
do not coincide with those of the left-invariant fields in general unless 
the group is Abelian. 


Exercise 3.11 

Show that any basis $\{V_i(e), i = 1, \ldots, n\}$ for $T_e$ defines a linearly 
independent set of left-invariant vector fields, which we shall call $\{V_i\}$. 

Consider the tangent bundle of the group, TG. In some neighborhood U 
of e adopt coordinates for it as follows. Let X be a vector at point g of 
U, with $X = \sum_i a_i V_i(g)$. The fiber at g is just $R^n$, so take the coordinates 
of X to be $\{a_i\}$. Let the coordinates of TG over U then be {{coordinates 
of g}, $\{a_i\}$}. Show that this prescription extends to all of TG in such a 
way as to prove that TG has a 1-1 map onto $G \times R^n$, i.e. that the 
tangent bundle of a Lie group is trivial. 


3.15 Examples of Lie groups 

(i) The simplest example is $R^n$, which is a manifold and a group under 
vector addition. This is an Abelian group. The one-parameter subgroups are the 
‘rays’ (straight lines through the origin). The left-invariant vector fields are 
parallel to the rays, so they all commute. The Lie algebra is thus the vector space 
$T_e$ equipped with the trivial Abelian bracket: [V, W] = 0 for all V and W in $T_e$. 

(ii) For physics, one of the most important Lie groups is the group of all 
n × n real matrices with nonvanishing determinant, called GL(n, R) or the 
General Linear group in n Real dimensions, which is a Lie group for the 
following reasons. First, it is a group with the operation of matrix multiplication, the 
unit matrix being the identity element. (The restriction to nonvanishing 
determinant is necessary to ensure the existence of an inverse element for any matrix.) 
Second, it is a Lie group because it is a manifold. Any matrix A in GL(n, R) with 
entries $\{a^i{}_j, i, j = 1, \ldots, n\}$ has a neighborhood of radius ε defined as those 
matrices B for which $|b^i{}_j - a^i{}_j| < \epsilon$ for all i and j, and ε can be chosen small 
enough so that every B has nonvanishing determinant. The numbers 
$x^i{}_j = b^i{}_j - a^i{}_j$ are coordinates for this neighborhood, and as there are $n^2$ of them, 
all independent, the dimension of GL(n, R) is $n^2$. In fact it is a submanifold of 
$R^{n^2}$. Since $R^{n^2}$, like any $R^m$, is identical with the tangent space of any of its 
points, the tangent space of the identity e of GL(n, R) is $R^{n^2}$ and any tangent 
vector can be represented as a matrix. For instance, the curve in GL(n, R) 
comprising the matrices diag (1 + exp (λ), 1, 1, ..., 1), which has parameter λ, has 
tangent diag (1, 0, 0, ..., 0) at λ = 0. This matrix has zero determinant, 
illustrating the fact that any matrix is in $T_e$ and therefore any matrix generates a 
one-parameter subgroup, a left-invariant vector field,† and an element of the Lie 
algebra. 

† The reader should bear in mind that a vector tangent to G is in fact a matrix, not 
to be confused with a ‘column vector’, which plays no role here. 

The one-parameter subgroup generated by any matrix A is the integral curve 
through e of the left-invariant vector field whose tangent at e is A. If we denote 
these matrices by $g_A(t)$ with $dg_A(t)/dt|_0 = A$ (which simply means $d(g_A)^i{}_j/dt|_0 
= a^i{}_j$ for all i, j), then by (3.52) we have 

$g_A(t + \Delta t) = g_A(t)\, g_A(\Delta t)$

$\Rightarrow\; dg_A(t)/dt = g_A(t)\, A$ (3.53)

$\Rightarrow\; g_A(t) = \exp(tA)$ (3.54)

$= I + tA + \frac{1}{2!} t^2 A^2 + \frac{1}{3!} t^3 A^3 + \cdots.$ (3.55)

Equation (3.55) is the definition of the exponential of a matrix, and with (3.54) 
gives concreteness to the formal expression (3.51). So the one-parameter 
subgroups of GL(n, R) are the exponentials of arbitrary n × n matrices. The 
matrix A is often called by physicists the infinitesimal generator of the subgroup 
$g_A(t)$. Exercise 3.12 explores properties of exp (tA). 
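
The series (3.55) is easy to sum numerically. The following sketch, which is ours and not part of the text, truncates the series after a fixed number of terms (the cutoff `terms` and all function names are illustrative assumptions) and then checks the subgroup law (3.52) and two closed forms: a rotation generator, and a nondiagonalizable matrix of the type that appears as case (iii) of exercise 3.12.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t=1.0, terms=40):
    """Partial sum of the series (3.55) for exp(tA)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term (tA)^k / k!
    for k in range(1, terms):
        term = [[t / k * x for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def max_diff(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

K = [[0.0, 1.0], [-1.0, 0.0]]   # antisymmetric: generates plane rotations
N = [[1.0, 1.0], [0.0, 1.0]]    # nondiagonalizable, eigenvalue 1 repeated

s, t = 0.4, 1.3
# Subgroup law (3.52): g(s + t) = g(s) g(t)
group_law_error = max_diff(mat_exp(K, s + t),
                           mat_mul(mat_exp(K, s), mat_exp(K, t)))
# Closed forms: a rotation, and e^t times a unipotent matrix
rotation_error = max_diff(mat_exp(K, t),
                          [[math.cos(t), math.sin(t)],
                           [-math.sin(t), math.cos(t)]])
jordan_error = max_diff(mat_exp(N, t),
                        [[math.exp(t), t * math.exp(t)],
                         [0.0, math.exp(t)]])
print(group_law_error, rotation_error, jordan_error)
```

A fixed truncation is adequate only for matrices of modest norm; it is used here because it mirrors (3.55) directly, not because it is a recommended numerical method.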


Exercise 3.12 
(a) Show that (3.55) satisfies (3.53). 
(b) Show that (3.55) implies 
$\exp(B^{-1}AB) = B^{-1} \exp(A)\, B.$ (3.56)
(c) It can be shown (see Hirsch & Smale, 1974) that for any real matrix A, 
a real matrix B can be chosen so that $B^{-1}AB$ has the following canonical 
form (called block-diagonal form, since the nonzero elements fall in 
square blocks along the main diagonal) 

$B^{-1}AB = \begin{pmatrix} P_1 & 0 & 0 & \cdots \\ 0 & P_2 & 0 & \cdots \\ 0 & 0 & P_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$ (3.57)
where each $P_j$ is a square matrix having one of the forms 
(i) $P_j$ is a 1 × 1 matrix 
$(\lambda_j),$ (3.58a)
or (ii) $P_j$ is a 2 × 2 nondiagonal matrix given by 
$\begin{pmatrix} r_j & s_j \\ -s_j & r_j \end{pmatrix}$ (3.58b)
or (iii) $P_j$ is an $n_j \times n_j$ nondiagonal matrix ($n_j \geq 2$) given by 
$\begin{pmatrix} \mu_j & 1 & 0 & \cdots & 0 \\ 0 & \mu_j & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \mu_j & 1 \\ 0 & 0 & \cdots & 0 & \mu_j \end{pmatrix}.$ (3.58c)

Moreover, the numbers $\lambda_j$, $\mu_j$, and $r_j \pm i s_j$ are the eigenvalues of A. 
Show from this and (3.55) that $\exp(tB^{-1}AB)$ similarly has 
block-diagonal form with the corresponding blocks: 

(i) $(e^{t\lambda_j}),$ (3.59a)

(ii) $e^{t r_j} \begin{pmatrix} \cos ts_j & \sin ts_j \\ -\sin ts_j & \cos ts_j \end{pmatrix},$ (3.59b)

(iii) $e^{t\mu_j} \begin{pmatrix} 1 & t & \frac{1}{2!}t^2 & \frac{1}{3!}t^3 & \cdots \\ 0 & 1 & t & \frac{1}{2!}t^2 & \cdots \\ 0 & 0 & 1 & t & \cdots \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}.$ (3.59c)

It follows from (a) that a matrix B which puts A into canonical form 
also puts exp (tA) into canonical form in cases (i) and (ii), but that the 
transformation to canonical form in (iii) is a function of t. 


Note that not every element of GL(n, R) is a member of a one-parameter sub- 
group. One reason is that such a subgroup is a continuous curve in GL(n, R), on 
which the determinant must change continuously. Since the determinant is 1 at 
e and cannot be zero, there is no continuous curve linking e to a matrix with 
negative determinant. (The reader can easily see that (3.59) represents only 
matrices with positive determinants.) This is an example of a disconnected group 
and illustrates the interesting global properties Lie groups can have: one does 
not usually learn everything about a Lie group just by studying its one-parameter 
subgroups or even its Lie algebra. Those elements which can be joined to e by a 
continuous path (not necessarily a one-parameter subgroup) are called the 
component of the identity of the group. 


Exercise 3.13 
Show that the matrix 

$\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$
is in the component of the identity of GL(2, R), but is not in any 
one-parameter subgroup. (Hint: construct a continuous path joining it to 
$e = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.) 


What is the Lie algebra of GL(n, R)? Given a tangent vector $A_e$ at e and its 
one-parameter subgroup $g_{A_e}(t)$, the left-translation $f g_{A_e}(t)$ of this curve by any 
matrix f of GL(n, R) produces a curve of the congruence of the left-invariant 
vector field corresponding to $A_e$, as in figure 3.11. This is how $A_e$ generates its 
left-invariant field, which we call simply A. If in fact f is on the curve $g_{B_e}(t)$ 
through e generated by any matrix $B_e$ in $T_e$ then the Lie bracket of the two 
vector fields at e, $[A, B]|_e$, is by (2.12) just 

$\lim_{t \to 0} \frac{1}{t^2} [g_{A_e}(t)\, g_{B_e}(t) - g_{B_e}(t)\, g_{A_e}(t)],$

which is easily evaluated using (3.55): 
$[A, B]|_e = A_e B_e - B_e A_e.$ (3.60)

That is, the Lie bracket of any two left-invariant vector fields at e in GL(n, R) is 
just the ordinary matrix commutator of the two matrices which generate the 
fields. The left-invariant vector field generated by this commutator is the 
element of the Lie algebra $\mathcal{L}(GL(n, R))$ which is the bracket of the original fields. 
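
The limit leading to (3.60) can be watched numerically. In the sketch below, which is our own illustration, the two one-parameter subgroups are built from the series (3.55), the difference of the two orderings is divided by $t^2$ for a small parameter value, and the result is compared with the matrix commutator. The generators are the antisymmetric matrices that appear later as $L_1$ and $L_2$ in (3.63); the function names and the value of `eps` are assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=30):
    """Partial sum of the series (3.55) for exp(tA)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t / k * x for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def commutator(A, B):
    """Matrix commutator AB - BA, the right-hand side of (3.60)."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def bracket_from_subgroups(A, B, eps=1e-4):
    """[g_A(eps) g_B(eps) - g_B(eps) g_A(eps)] / eps^2 at small eps."""
    scaled = commutator(mat_exp(A, eps), mat_exp(B, eps))
    return [[x / eps ** 2 for x in row] for row in scaled]

L1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
L2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]

approx = bracket_from_subgroups(L1, L2)
exact = commutator(L1, L2)
error = max(abs(a - b) for ra, rb in zip(approx, exact)
            for a, b in zip(ra, rb))
print(exact, error)
```

The discrepancy is of first order in `eps`, as expected from the neglected cubic terms in the product of the two exponentials.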

(iii) We have seen that the rotation group is a Lie group (§2.3(vi)). We will 
study it closely below, but here we examine it as a subgroup of GL(n, R). In 
§2.29 we saw that the matrices A for which $A^{-1} = A^T$ are elements of the 
Euclidean symmetry group O(n). (The symbol O(n) means the Orthogonal group 
in n dimensions.) Since the determinant of any matrix obeys the rules (§1.6) 

$\det(A) = 1/\det(A^{-1}), \quad \det(A) = \det(A^T),$ (3.61)
matrices in O(n) have determinant ±1. Those with determinant +1 form a 
subgroup called SO(n), the Special Orthogonal group, and we shall now 
demonstrate that this is the group of rotations. (The matrices in O(n) which have 
determinant −1 are not a subgroup since they do not include the identity matrix. 
Like GL(n, R), O(n) is disconnected.) 


Exercise 3.14 

(a) Show that if A is in O(n) its eigenvalues equal those of $A^{-1}$. (Use the 
fact that det B = det $B^T$ for any B.) The reader not experienced with 
eigenvalues should look up their definition in §1.6. 

(b) Show that for any nonsingular matrix A the eigenvalues of A are the 
reciprocals of those of $A^{-1}$. (Use the fact that det (AB) = det A det B.) 
Conclude that there are two types of eigenvalues $(\lambda_1, \ldots, \lambda_n)$ of A 
in O(n): either (i) $\lambda_j = \pm 1$ or (ii) $\lambda_j \lambda_k = 1$ for $j \neq k$. Show that case (ii) 
implies the eigenvalues come in pairs $(e^{i\theta}, e^{-i\theta})$ for real $\theta$. 

(c) It can also be shown that the canonical form of a matrix A in O(n) can 
be achieved by a transformation $B^{-1}AB$, where B is a matrix in SO(n). 
Use this to conclude that a matrix in O(n) has canonical form consisting 
of blocks 
(i) $(1),$ (3.62a)
or (ii) $(-1),$ (3.62b)
or (iii) $\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$ (3.62c)

(d) Show that the Lie algebra of O(n) consists of all antisymmetric matrices. 
Show from this that O(n) has dimension $\frac{1}{2}n(n - 1)$. 


Fig. 3.11. Left-translation of $g_{A_e}(t)$ by f. 


Now, a matrix A in GL(n, R) may be regarded as an invertible $\binom{1}{1}$ tensor on 
$R^n$, mapping a column vector V of $R^n$ to AV, obtained by matrix multiplication. 
The transformation $B^{-1}AB$ is nothing more than the transformation of the 
components of this tensor (§2.26) when the basis $\{\bar e_1, \ldots, \bar e_n\}$ is transformed to 
$\{B\bar e_1, \ldots, B\bar e_n\}$. We can therefore take the view that any matrix of SO(n) is 
equivalent to successive rotations in independent two-dimensional planes, since 
the canonical form (3.62c) obviously does that, while the form (3.62b) must 
occur an even number of times (for the determinant to be positive), and these 
directions can be paired to give (3.62c) with $\theta = \pi$. Thus SO(n) is indeed the rotation 
group. (Note that if n is odd, every matrix in SO(n) fixes at least one direction.) 
The remaining matrices of O(n) can be interpreted as inversions, transformations 
which change the ‘handedness’ of any set of n linearly independent vectors. This 
is shown in the next exercise. (‘Handedness’ of a basis will be discussed in detail 
in chapter 4.) 


Exercise 3.15 

Show that the canonical form of an element of O(n) not in SO(n) is the 
product of a matrix diag (1, ..., 1, −1, 1, ..., 1) (having only one 
−1 on its diagonal) with the canonical form of a matrix in SO(n). 
From this prove that it is an inversion. 


Exercise 3.16 

Prove that any matrix in SO(n) is in a one-parameter subgroup. Prove 
that any matrix in SO(3) is equivalent to a single rotation through a 
finite angle $\theta$ about some axis. 


Before leaving the rotation group we need to study its Lie algebra, at least for 
SO(3). The vector space $T_e$ is that of all antisymmetric matrices, which has 
dimension three (see exercise 3.14(d)). A basis consists of the matrices 

$L_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad L_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad L_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$ (3.63)


Exercise 3.17 

Show that this Lie algebra basis has the brackets 
$[L_1, L_2] = L_3, \quad [L_2, L_3] = L_1, \quad [L_3, L_1] = L_2.$ (3.64)


We will come back to this algebra shortly. 

(iv) Another matrix group of interest in physics is SU(n), which stands for 
Special Unitary group in n dimensions. This is a subgroup of GL(n, C), the group 
of all complex n × n matrices of nonvanishing determinant (the General Linear 
group in n Complex dimensions). Since each entry may be complex and each 
complex number is defined by two real ones, GL(n, C) has $2n^2$ (real) dimensions. 
Its subgroup U(n) is the Unitary group, each element U obeying $U^{-1} = U^*$, 
where * denotes the complex-conjugate transpose (Hermitian conjugate). By 
analogy with O(n), its Lie algebra consists of all n × n anti-Hermitian matrices. 
(A matrix A is anti-Hermitian if $A^* = -A$.) This has $n^2$ real dimensions, since 
such a matrix can have $\frac{1}{2}n(n - 1)$ arbitrary complex off-diagonal elements (given 
by $n(n - 1)$ real numbers) and n arbitrary pure-imaginary diagonal elements 
(contributing n real dimensions, making $n^2$ in all). Its subgroup SU(n) is the set 
of all matrices in U(n) with unit determinant. Since the determinant of any 
element of U(n) is a complex number of unit modulus, fixing it to be 1 is one 
extra real condition, so SU(n) has dimension 
$n^2 - 1$. Its Lie algebra is the set of all anti-Hermitian matrices with zero trace. 
(The trace of A is the sum $a^i{}_i$: see §1.6.) 


Exercise 3.18 

Show that the Lie algebra of SU(n) is that of all anti-Hermitian traceless 
matrices. You may use the fact that any element of U(n) has canonical 
form diag $(e^{i\theta_1}, e^{i\theta_2}, \ldots, e^{i\theta_n})$ where the numbers $\{\theta_j, j = 1, \ldots, n\}$ 
are real. 


Exercise 3.19 
Show that the following matrices are a basis for $T_e$ of SU(2): 

$J_1 = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \quad J_2 = \frac{1}{2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad J_3 = \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$ (3.65)

Show that this is a three-dimensional real vector space: even though the 
matrices may contain imaginary numbers, only linear combinations of 
them with real coefficients remain in the vector space $T_e$ of SU(2). 
Show that this Lie algebra basis has the brackets 

$[J_1, J_2] = J_3, \quad [J_2, J_3] = J_1, \quad [J_3, J_1] = J_2.$ (3.66)

These are formally identical to (3.64), and we shall see in the next 
section that this implies an intimate relation between SU(2) and SO(3). 
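
The formal identity of the two sets of brackets can be confirmed by direct matrix multiplication. The following sketch is a numerical check, not a substitute for the exercises: it assumes the matrices of (3.63) and, for SU(2), the basis $J_1 = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$, $J_2 = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $J_3 = \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, which is the form consistent with (3.74). The helper names are ours.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """Matrix commutator AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    """True if all corresponding entries agree to within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B)
               for a, b in zip(ra, rb))

# Basis of the Lie algebra of SO(3), as in (3.63)
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

# Assumed basis of the Lie algebra of SU(2): anti-Hermitian and traceless
J = [
    [[0, 0.5j], [0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[0.5j, 0], [0, -0.5j]],
]

def has_cyclic_brackets(X):
    """Check [X1, X2] = X3, [X2, X3] = X1, [X3, X1] = X2."""
    return all(close(commutator(X[a], X[(a + 1) % 3]), X[(a + 2) % 3])
               for a in range(3))

print(has_cyclic_brackets(L), has_cyclic_brackets(J))
```

Both bases have structure constants equal to the permutation symbol, which is the sense in which the two algebras are formally identical.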


Exercise 3.20 

Let tr (A) = $a^i{}_i$ denote the trace of any matrix A. Prove that tr ($B^{-1}AB$) 
= tr (A). 

Use this and (i) the fact that the determinants of matrices obey the rule 
det (AB) = det (A) det (B), (ii) the result (3.56), and (iii) the canonical 

forms (3.55-59) to prove that for any matrix A 

det (exp (A)) = exp (tr (A)). (3.67) 


Use this to give an easier proof of exercise 3.18. 


3.16 Lie algebras and their groups 

Every Lie group G has its Lie algebra $\mathcal{L}(G)$. Since every element g of G is 
the image of e under the left-translation g generates, and since every vector in $T_e$ 
point g of G is on one curve of each of the left-invariant congruences. Is it poss- 
ible, then, to construct the group G entirely from a knowledge of its Lie algebra? 
The answer is a partial yes, but to phrase it we must first give a better definition 
of a Lie algebra than we have so far been working with. 

A Lie algebra is a real vector space V upon which is defined a bilinear 
multiplication rule called [ , ] which produces from any two vectors A and B 
another vector [A, B] satisfying: 
(i) [A, B] = −[B, A], (3.68) 

(ii) [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. (3.69) 
The crucial difference between this definition and the one in §2.14 is that the 
Lie bracket is defined formally, i.e. by its properties (i) and (ii), so that any rule 
for combining vectors in this manner is acceptable. The commutator of vector 
fields provides one such rule, and this was the only one used until now. But 
clearly another example is the vector space $R^3$ with the usual cross-product: 


$[\bar a, \bar b] = \bar a \times \bar b.$ (3.70)


Exercise 3.21 

(a) Verify that (3.70) satisfies the Jacobi identity (3.69). 

(b) Show that the basis $\bar e_1 = (1, 0, 0)$, $\bar e_2 = (0, 1, 0)$, $\bar e_3 = (0, 0, 1)$ has the 
brackets 


$[\bar e_1, \bar e_2] = \bar e_3, \quad [\bar e_2, \bar e_3] = \bar e_1, \quad [\bar e_3, \bar e_1] = \bar e_2.$ (3.71)


Compare these to (3.64) and (3.66). 


We can now state but not prove a theorem which is of fundamental import- 
ance to physics, that behind every Lie algebra there is a group. Precisely, every 
Lie algebra is the Lie algebra of one and only one simply-connected Lie group. 
(A manifold is simply connected if every closed curve can be smoothly shrunk to 
a point. See Spivak (1970) or Warner (1971) for discussions and partial proofs of 
this theorem.) Moreover, any other Lie group with the same Lie algebra but not 
simply connected is covered by the simply-connected one. (A connected 
manifold M covers another N if there is a map $\pi$ of M onto N such that the inverse 
image of some neighborhood V of any point P of N is a disjoint union of open 
neighborhoods of the points in $\pi^{-1}(P)$ in M. An example is given in figure 3.12.) 
The covering must be a homomorphism of the two groups. (See §1.4 for a 
definition of a homomorphism.) 

The groups SO(3) and SU(2) illustrate this theorem nicely. First we shall 
show that SU(2) is simply connected. We do this by considering the set H of 
matrices of the form 

$\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$ (3.72)

for arbitrary complex a and b, bars denoting complex conjugation. 


Exercise 3.22 
(a) Show that $H - \left\{ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right\}$, the subset of H with non-zero determinant, is a 
group under multiplication, hence a Lie subgroup of GL(2, C). 
(b) Show that H is a real vector space (using matrix addition), has 
dimension 4, and has a basis consisting of $J_1$, $J_2$, and $J_3$ of exercise 3.19, plus 
the matrix $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. 
(c) Let A be any matrix in H: 
$A = 2a_1 J_1 + 2a_2 J_2 + 2a_3 J_3 + a_4 I,$
with $\{a_j\}$ real. Show that A is in SU(2) if and only if 
$a_1^2 + a_2^2 + a_3^2 + a_4^2 = 1.$ (3.73)
(d) Show from this that the group SU(2) has a 1-1 mapping onto the 
three-sphere $S^3$, which is a simply connected manifold. (We say that $S^3$ and 
SU(2) are diffeomorphic.) 


We must next find a mapping $\pi$: SU(2) → SO(3) which is a multiple covering. 
We can construct it easily by exponentiating the elements of the Lie algebra. In 
SU(2) the element $J_1$ has exponential 

$\exp(tJ_1) = I + tJ_1 + \frac{1}{2!} t^2 J_1^2 + \cdots = \begin{pmatrix} \cos(t/2) & i\sin(t/2) \\ i\sin(t/2) & \cos(t/2) \end{pmatrix}.$ (3.74)
The element $L_1$ of SO(3) has the exponential 

$\exp(sL_1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + s \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} + \frac{1}{2!} s^2 \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} + \frac{1}{3!} s^3 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} + \cdots$

$= \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos s & -\sin s \\ 0 & \sin s & \cos s \end{pmatrix}.$ (3.75)

If we simply establish the natural correspondence suggested by the algebra, 

$\pi$: SU(2) → SO(3), $\pi: \begin{pmatrix} \cos\frac{1}{2}t & i\sin\frac{1}{2}t \\ i\sin\frac{1}{2}t & \cos\frac{1}{2}t \end{pmatrix} \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{pmatrix},$ (3.76)

then it is clear that this is a homomorphism of the two one-parameter subgroups, 
and it is also clear that the two elements t and t + 2π of SU(2) have the same 
image in SO(3). Moreover, t + 4nπ for any integer n is the same point of SU(2) 
as t, so we have proved that exp ($tJ_1$) is a double covering of exp ($sL_1$). We can 
generalize this to the whole group: the map 

$\pi: \exp(t_1 J_1 + t_2 J_2 + t_3 J_3) \mapsto \exp(t_1 L_1 + t_2 L_2 + t_3 L_3)$ (3.77)
is a double covering of SO(3) by SU(2). 
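
The two-to-one character of the correspondence (3.76) shows up directly in the exponentials. The sketch below, which is our own check, sums the series (3.55) for $J_1$ and $L_1$ and compares parameter values differing by 2π and 4π. It assumes $J_1 = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$, as implied by (3.74); the function names, the truncation `terms`, and the sample parameter are illustrative.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=80):
    """Partial sum of the series (3.55) for exp(tA); entries may be complex."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t / k * x for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def max_diff(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def negated(A):
    """The matrix -A."""
    return [[-x for x in row] for row in A]

J1 = [[0, 0.5j], [0.5j, 0]]                 # assumed form, cf. (3.74)
L1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]     # as in (3.63)

t = 0.9
two_pi = 2 * math.pi

# In SU(2), t and t + 2*pi give opposite matrices (distinct group elements),
su2_sign_flip = max_diff(mat_exp(J1, t + two_pi), negated(mat_exp(J1, t)))
# and the curve closes only after 4*pi,
su2_period = max_diff(mat_exp(J1, t + 2 * two_pi), mat_exp(J1, t))
# while in SO(3) the curve already closes after 2*pi.
so3_period = max_diff(mat_exp(L1, t + two_pi), mat_exp(L1, t))
print(su2_sign_flip, su2_period, so3_period)
```

All three differences should be at the level of rounding error: the elements exp ($tJ_1$) and exp ($(t + 2\pi)J_1$) are distinct, differing by an overall sign, yet they correspond to the same rotation.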

Since we know that SU(2) has the global topology of the three-sphere, this 
double covering enables us to discover the topology of SO(3). The one-parameter 
subgroup exp ($tJ_1$) of SU(2) begins at e with t = 0 and returns to e at t = 4π. In 
figure 3.13 this is shown as a great circle around $S^3$. (But bear in mind that we 
have not put a metric on SU(2). Only the global topology is relevant, not the 
actual distance relations.) The points labelled t and t + 2π are diametrically 
opposite one another. In order to make them the same point of SO(3), we simply 
identify SO(3) as the top half of $S^3$, with points on opposite ends of a diameter 
through the equator (e.g. t = π and t = 3π) identified. This half of $S^3$ with these 
identifications is no longer simply connected. A curve such as $\mathcal{C}$ can be shrunk 
smoothly to a point, but the curve of the subgroup exp ($tL_1$) cannot be, since 
the two ends of the diameter of the equator cannot be brought together: they 
are always diametrically opposite one another. This construction also makes 
clear the fact that SO(3) and SU(2) are identical in some neighborhood of e. It is 
for this reason that their Lie algebras are the same. This happens for any two 
groups with the same Lie algebras. 


Fig. 3.12. The unit circle $S^1$ is covered by the real line $R^1$ an infinite 
number of times by the map $\pi$: $R^1 \to S^1$ which takes x to the point on 
$S^1$ whose coordinates in the plane $R^2$ are $\pi(x) = (\cos x, \sin x)$. The set 
$\pi^{-1}(V)$ is the union of all the open intervals shown on $R^1$. 


Fig. 3.13. A two-dimensional slice of $S^3$ containing the one-parameter 
subgroup exp ($tJ_1$) of SU(2). The group SO(3) is the top half of the 
sphere, with points on opposite ends of diameters identified with each 
other. 

To what group does the Lie algebra of equation (3.70) correspond? This is 
entirely a matter of interpretation. As an abstract algebra it corresponds to both 
groups. As a relation among vectors in $R^3$ it is most common to associate it with
SO(3) by saying that to the subgroup $\exp(\theta L_1)$ (rotation by an angle θ about
the x-axis) there corresponds the 'curve' in $R^3$, $\exp(\theta\,\bar e_x)$ (a vector along the
x-axis of length θ). This association of a rotation with a vector is very familiar to
physicists, even more so in its time-differentiated version associating a rate of
rotation with an angular velocity vector. This convenient identification is an
accident of three dimensions: the group SO(4) has dimension 6 while the vector
space $R^4$ in which it acts has dimension 4, so no such identification is possible.
But to return to $R^3$, we can equally well identify it with SU(2) in a similar
fashion. In §3.18 we will see that this permits us to associate the spin of a
particle with a vector in $R^3$ even though the spin is not an element of $T_P$ for any P
in $R^3$.

Before leaving Lie algebras, we must remark that we can now show that an 
Abelian Lie algebra is the Lie algebra of an Abelian group. An n-dimensional 
Abelian algebra is simply a vector space, and it is the algebra of the Lie group
$R^n$, as discussed in §3.15. Since $R^n$ is simply connected, any other Lie group
having this algebra must be covered by $R^n$ and must be identical to it in a
neighborhood of the origin e. Since $R^n$ is Abelian (V + W = W + V), so is any other
Lie group with an Abelian Lie algebra. 


3.17 Realizations and representations 

It is usually best to regard any group as an abstract group, defined 
entirely by the group operation and, for Lie groups, by the manifold structure. 
Thus, SO(3) as an abstract group is simply a certain three-dimensional manifold 
with a rule associating a product point gh with any two points g and h, the rule 
obeying the usual group axioms. To a physicist this abstract structure is not the 
aspect of group theory that is of most interest. More important is what the 
group acts upon and how it affects it. Again, SO(3) is important because we 
associate with each point of it a rotation of our three-dimensional space. Such an 
association is called a realization. A realization of a group G is an association
(map) between any element g of G and a transformation T(g) of some space M
in such a way that the group properties are preserved: (i) T(e) = I, the identity
transformation (no change of M); (ii) $T(g^{-1}) = [T(g)]^{-1}$; (iii) $T(g) \circ T(h) = T(gh)$.
The realization is faithful if the association is 1-1: T(g) ≠ T(h) if g ≠ h. If M is a
vector space and every T(g) is a linear transformation (a $\binom{1}{1}$ tensor on that vector
space) then the realization is called a representation. A few examples may help
to make these ideas clear. 

(i) Consider the effect of a rotation on the unit sphere $S^2$ given by the
equation $x^2 + y^2 + z^2 = 1$ in $R^3$. Suppose we rotate by an angle θ about the
x-axis. This consists of mapping any point on the sphere whose coordinates are
(x, y, z) to one whose coordinates are (x', y', z') as follows

$$\begin{aligned} x' &= x, \\ y' &= y\cos\theta - z\sin\theta, \\ z' &= y\sin\theta + z\cos\theta, \end{aligned} \tag{3.78}$$

which is still on the sphere since $(x')^2 + (y')^2 + (z')^2 = 1$. This transformation
is associated with the group element $\exp(\theta L_1)$ of SO(3), in the notation of
(3.63). To any element of the group there corresponds some transformation of
$S^2$ into itself. Since $S^2$ is a manifold but not a vector space, this is a realization
of SO(3). On the other hand, the same transformation (3.78) can be regarded as
a map of $R^3$ into itself, not just of $S^2$ into itself. Since $R^3$ is a vector space, this
is a representation of SO(3) in terms of matrices which transform vectors of $R^3$
into other vectors. These matrices are nothing more than the matrices we used to 
define the group SO(3) in the first place. This illustrates a subtle but useful point 
of view. It is typical for a group to be defined in the first place by a (faithful) 
realization or representation, because this enables one to study all its properties 
concretely. Afterwards, however, it is more useful to regard the group as abstract 
because there may be other useful representations or realizations that one had 
not been aware of at first. We will illustrate these for the rotation group separ- 
ately in the next section. 
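
The three realization properties can be tested directly on transformation (3.78). The following is a minimal sketch restricted to the one-parameter subgroup of rotations about the x-axis, where the group product corresponds to adding angles; the function names and the sample point and angles are illustrative assumptions, not part of the text.

```python
import math

def rotate_x(theta, point):
    """Transformation (3.78): rotation by theta about the x-axis."""
    x, y, z = point
    return (x,
            y * math.cos(theta) - z * math.sin(theta),
            y * math.sin(theta) + z * math.cos(theta))

def close(p, q, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(p, q))

p = (0.6, 0.0, 0.8)      # illustrative point with x^2 + y^2 + z^2 = 1
a, b = 0.4, 1.1          # illustrative rotation angles

identity_ok = close(rotate_x(0.0, p), p)                             # (i)   T(e) = I
inverse_ok = close(rotate_x(-a, rotate_x(a, p)), p)                  # (ii)  T(g^-1) = [T(g)]^-1
product_ok = close(rotate_x(a, rotate_x(b, p)), rotate_x(a + b, p))  # (iii) T(g) o T(h) = T(gh)
on_sphere = abs(sum(c * c for c in rotate_x(a, p)) - 1.0) < 1e-12    # S^2 is mapped into itself

# Linearity on R^3: this is what turns the same map into a representation.
u, v = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)
u_plus_v = tuple(s + t for s, t in zip(u, v))
sum_of_images = tuple(s + t for s, t in zip(rotate_x(a, u), rotate_x(a, v)))
linear_ok = close(rotate_x(a, u_plus_v), sum_of_images)

print(identity_ok, inverse_ok, product_ok, on_sphere, linear_ok)
```

The first four checks use only points of the sphere (a realization on $S^2$); the last uses vector addition, which is available only when the map is regarded as acting on $R^3$.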

(ii) Every group has at least two faithful realizations: the left and right trans- 
lations of itself. Any group element g defines a transformation of G which maps 
any h to gh (the progressive or principal realization) and one which maps h to
$hg^{-1}$ (the retrograde realization).

(iii)† The matrix groups that we have studied, namely GL(n, R), O(n), SO(n),
GL(n, C), U(n), and SU(n), have all been studied through their faithful
representations as n × n matrix transformations of n-dimensional real or complex
vector spaces. But each Lie group G has another representation as linear
transformations on its own Lie algebra. This is called the adjoint representation,
and is defined as follows. Consider first the map of G into itself given by
$I_g: h \mapsto ghg^{-1}$. This is the group adjoint realization of G, consisting of
left-translation by g and right-translation by $g^{-1}$. (It is not necessarily
faithful: if G is Abelian then $I_g$ is the identity map $h \mapsto h$ for all g.) This
realization is called the inner automorphisms of G. Notice that each $I_g$ maps the
identity e into itself, so that every curve through e is mapped into a (possibly
different) curve through e, as shown in figure 3.14. Therefore $I_g$ induces a map of
any tangent vector in $T_e$ to another one in $T_e$. This map is called $\mathrm{Ad}_g$, the
adjoint transformation of $T_e$ induced by g. Now, if the solid curve in figure 3.14
is a one-parameter subgroup, say exp(tX) where X is in $T_e$, then so is its image
under $I_g$, since $g(fh)g^{-1} = (gfg^{-1})(ghg^{-1})$. It follows that the dashed curve in
figure 3.14 is the one-parameter subgroup generated by $\mathrm{Ad}_g(X)$,

$$I_g[\exp(tX)] = \exp[t\,\mathrm{Ad}_g(X)]. \tag{3.79}$$

Now if g itself is a member of a one-parameter subgroup g(s) = exp(sY) there
should be a natural expression for $\mathrm{Ad}_g(X)$ in terms of Y. This is provided by the
next exercise.

† This example may be regarded as supplementary material.


Exercise 3.23 
Show that 


$$\mathrm{Ad}_{g(s)}(X) = \exp(s\,\pounds_Y)X. \tag{3.80}$$


Fig. 3.14. What happens to curves through e under the map $h \mapsto ghg^{-1}$,
shown first as the map $h \mapsto gh$ followed by the map $gh \mapsto ghg^{-1}$. The
identity e is mapped into itself but points h and f near it are generally
changed, so that a tangent vector at e is mapped into another one.


3.18 Spherical symmetry, spherical harmonics and representations of the rotation group

We have discussed Killing vectors and their relation to symmetries of
Euclidean space. We can now make all of these notions precise by concentrating
on the example of spherical symmetry. A manifold M with a metric tensor $\mathbf{g}$ is
said to be spherically symmetric if the Lie algebra of its Killing vector fields has
a subalgebra (i.e. a subspace whose brackets remain in the subspace) which is
the Lie algebra of SO(3). We have to speak of a subalgebra because $\mathbf{g}$ might have
more symmetries, and here we will only consider those having to do with its
spherical nature. The reader should note that it may be wrong to say M is spherical
'about some point', because the 'centers' of the spheres may not be in M (see
figure 3.15). Our definition is intrinsic: the Lie subalgebra concerns vector fields
of M itself. In §3.9 we saw what the Lie algebra of the vector fields $\{l_x, l_y, l_z\}$
was, equation (3.30). By defining $V_1 = -l_x$, $V_2 = -l_y$, and $V_3 = -l_z$ we see
that the Lie algebra of the vector fields $\{V_i\}$ is identical to that of SO(3),
equation (3.64). This shows that our present definition of spherical symmetry
implies the existence of a foliation of M into surfaces with the geometry of
spheres. (Foliations were defined in §3.7.)

Suppose we concentrate now on functions defined on the two-sphere $S^2$. Any
function on M defines such a function on any of its spheres of symmetry. We
define the space of functions $L^2(S^2)$ to be the Hilbert space of all complex-valued
functions on $S^2$ which are square-integrable: the norm

$$\|f\| = \left[\int |f|^2 \sin\theta \, d\theta \, d\phi\right]^{1/2} \tag{3.81}$$

exists, where the integral is over the usual area element of the sphere. (Our
definition of this space is a little sloppy, but accurate enough for our purposes here.)
The space $L^2(S^2)$ is a vector space of infinite dimension. Its elements are
functions, linear combinations of which are made with constants, and no finite
number of functions is a basis. The realization of an element g of SO(3) as a


Fig. 3.15. The cylinder is axially symmetric, but the centers of its 
circles of symmetry are not in the manifold. 
mapping R(g) of $S^2$ into itself causes any function $f(x^i)$ on the sphere to be
mapped into another one, simply by being carried along by the mapping.
Therefore R(g) can also be identified as a representation of SO(3) in the vector space
$L^2(S^2)$, an infinite-dimensional representation since $L^2(S^2)$ is of infinite
dimension. The question arises whether there are finite-dimensional subspaces of
$L^2(S^2)$ which provide representations of SO(3). Such a subspace would have to
be invariant under SO(3), in the sense that R(g)[f] for any g in SO(3) and for
any f in the subspace must also be in the subspace. Suppose such a subspace
exists and $\{f_i, i = 1, \ldots, N\}$ is a basis for it. Then it is invariant if and only if
for any numbers $\{a^i\}$ there exist $\{b^i\}$ such that

$$R(g)[a^i f_i] = b^i f_i. \tag{3.82}$$

Because the map is linear there is a relation

$$b^i = g^i{}_j a^j, \tag{3.83}$$

which defines a matrix $g^i{}_j$ corresponding to the element g of SO(3). This matrix
is called the representation of g in the subspace. A representation of SO(3) in
any vector space V is said to be irreducible if V contains no subspace invariant
under SO(3) other than {0} and V itself.

The construction of the irreducible representation of SO(3) in $L^2(S^2)$ is
treated in many books (see Gel'fand, Minlos & Shapiro, 1963). All physicists
know the basis functions of the irreducible subspaces as the spherical harmonics,
$Y_{lm}$. Rather than go through their construction, let us simply try to understand
them in terms of the present discussion. The claim is this. Every irreducible
subspace of $L^2(S^2)$ is characterized by an integer l ≥ 0 and has dimension 2l + 1.
The functions $\{Y_{lm}, m = -l, \ldots, l\}$ are basis functions for this subspace, called
$V_l$. Moreover, the union of all these bases for all l is a basis for $L^2(S^2)$ itself,
which means that the spherical harmonics are complete. Since any map R(g) of
$S^2$ into itself is the exponentiation of a linear combination of the vectors $\{l_x, l_y, l_z\}$,
$V_l$ is invariant under SO(3) if and only if it is invariant under $l_x$, $l_y$ and $l_z$.

A trivial example is l = 0, where the basis function $Y_{00} = 1$ has Lie derivatives

$$l_x(Y_{00}) = l_y(Y_{00}) = l_z(Y_{00}) = 0,$$

all of which are certainly linearly dependent on $Y_{00}$. A better example is l = 1,
where the three basis functions are

$$Y_{1,-1} = \left(\frac{3}{8\pi}\right)^{1/2} \sin\theta \, e^{-i\phi}, \quad Y_{10} = \left(\frac{3}{4\pi}\right)^{1/2} \cos\theta, \quad Y_{11} = \left(\frac{3}{8\pi}\right)^{1/2} \sin\theta \, e^{i\phi}. \tag{3.84}$$


Exercise 3.24
(a) Show that if x, y, z are Cartesian coordinates of $R^3$, then on the sphere
$S^2$ given by $x^2 + y^2 + z^2 = 1$ the following hold

$$Y_{1,-1} = \left(\frac{3}{8\pi}\right)^{1/2} (x - iy), \quad Y_{10} = \left(\frac{3}{4\pi}\right)^{1/2} z, \quad Y_{11} = \left(\frac{3}{8\pi}\right)^{1/2} (x + iy). \tag{3.85}$$

(b) Construct all the derivatives $l_i(Y_{1m})$, e.g.

$$l_x(Y_{11}) = -iY_{10}/(2)^{1/2}, \qquad l_z(Y_{11}) = iY_{11}, \tag{3.86}$$

and show that the space $V_1$ is invariant under SO(3).
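
The two sample derivatives in (3.86) can be reproduced with a few lines of arithmetic, because by (3.85) the l = 1 harmonics are linear functions of x, y, z. The sketch below stores a linear function a x + b y + c z as the triple (a, b, c). The explicit forms $l_x = y\,\partial_z - z\,\partial_y$, $l_y = z\,\partial_x - x\,\partial_z$, $l_z = x\,\partial_y - y\,\partial_x$ are an assumption here (the definitions of §3.9 are not repeated in this section); they are chosen because they reproduce (3.86). All function names are illustrative.

```python
import math

# A linear function f = a*x + b*y + c*z is stored as the tuple (a, b, c).
# Assumed generators (not restated in this section):
#   l_x = y d/dz - z d/dy,  l_y = z d/dx - x d/dz,  l_z = x d/dy - y d/dx

def l_x(f):
    a, b, c = f
    return (0, c, -b)      # l_x f = c*y - b*z

def l_y(f):
    a, b, c = f
    return (-c, 0, a)      # l_y f = a*z - c*x

def l_z(f):
    a, b, c = f
    return (b, -a, 0)      # l_z f = b*x - a*y

def scale(k, f):
    return tuple(k * v for v in f)

def close(f, g, tol=1e-12):
    return all(abs(u - v) < tol for u, v in zip(f, g))

# The l = 1 harmonics in the Cartesian form (3.85).
k = math.sqrt(3 / (8 * math.pi))
Y1m1 = scale(k, (1, -1j, 0))
Y10 = scale(math.sqrt(3 / (4 * math.pi)), (0, 0, 1))
Y11 = scale(k, (1, 1j, 0))

check_lx = close(l_x(Y11), scale(-1j / math.sqrt(2), Y10))  # first relation in (3.86)
check_lz = close(l_z(Y11), scale(1j, Y11))                  # second relation in (3.86)
print(check_lx, check_lz)
```

In this encoding each generator sends a coefficient triple to another coefficient triple, so linear functions are mapped to linear functions; that closure is the invariance of $V_1$ asked for in part (b).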


Why is this particular basis for $V_l$ chosen? This is largely a matter of
convenience. It is convenient that the basis should consist of eigenfunctions of relevant
operators, i.e. functions which satisfy

$$Af = af \tag{3.87}$$

for some operator A and constant a. The spherical harmonics are chosen because
they are eigenfunctions of both $l_z$ and $L^2 = (\pounds_{l_x})^2 + (\pounds_{l_y})^2 + (\pounds_{l_z})^2$, which was
defined in exercise 3.7. The following exercise shows that this is the best one can
hope to do: one cannot find nontrivial simultaneous eigenfunctions of any two of
$\{l_x, l_y, l_z\}$.


Exercise 3.25 
Assume that a function f has the properties

$$l_x(f) = \alpha f, \qquad l_y(f) = \beta f$$

for constants α and β. Show from the Lie bracket relations (3.30) that
$\alpha = \beta = l_z(f) = 0$.


Incidentally, the completeness of the basis functions comes from the fact that $i\,l_z$
and $L^2$ are commuting operators (cf. exercise 3.7) which are (or extend to)
self-adjoint operators on $L^2(S^2)$. The spectral theorem of functional analysis (cf.
Riesz & Sz.-Nagy, 1955) guarantees completeness of their eigenfunctions.

Actually, the representations of SO(3) may be studied much more abstractly
than is apparent in the above discussion. In particular one does not need to
say what the vector space V is in order to develop most of the algebra. For
example, our original representation of SO(3) as matrices transforming vectors
of $R^3$ is certainly irreducible, since no subspace of $R^3$ except the trivial ones, {0}
and $R^3$ itself, is left invariant by all rotations. It turns out to be formally
identical to the representation l = 1 of the spherical harmonics, which also has
dimension three (= 2l + 1). In fact, equation (3.85) is simply a coordinate
transformation of $R^3$ from (x, y, z) to $(Y_{1,-1}, Y_{10}, Y_{11})$. The transformation involves
complex numbers, but if these are just treated algebraically then the matrices
$g^i{}_j$ of (3.83) expressed on the spherical-harmonic basis may be transformed into
matrices expressed on the usual Cartesian basis, and these matrices turn out to be
nothing more than the matrices we used to define SO(3) in the first place.


Exercise 3.26

Let $\{y^{j'}, j' = 1, 2, 3\}$ stand for the functions $(Y_{1,-1}, Y_{10}, Y_{11})$, and let
$\{x^i\}$ stand for {x, y, z}. Find the transformation matrix $\Lambda^{j'}{}_i = \partial y^{j'}/\partial x^i$
and its inverse $\Lambda^i{}_{j'}$. Find the matrix $X^{j'}{}_{k'}$ of the operator $l_z$ on the
spherical harmonic basis

$$l_z(y^{j'}) = X^{j'}{}_{k'}\, y^{k'},$$

by the methods of exercise 3.24(b). Transform $X^{j'}{}_{k'}$ to the Cartesian
basis

$$X^i{}_j = \Lambda^i{}_{k'} \Lambda^{l'}{}_j X^{k'}{}_{l'},$$

and show that it is just $L_3$ of equation (3.63).


Notice that l = 1 is the smallest faithful representation of SO(3): l = 0 is not
faithful. This is usually called the fundamental representation of SO(3). We will 
encounter another set of irreducible representations of SO(3) when we study 
vector spherical harmonics in §4.28. There the representation space will not be 
that of functions on the sphere but of vector fields on the sphere. 

Finally, we need to remark on the relation between representations of SO(3) 
and of its covering group SU(2). (This passage may be skipped by readers who 
have not studied §3.16.) Since there is a unique element of SO(3) associated 
with any one of SU(2), any representation R(g) for elements g of SO(3)
automatically defines a representation S of SU(2): for any u in SU(2) the
transformation S(u) is R(π(u)). If u and u' both correspond to the same element of SO(3),
then S(u) = S(u') for such representations. But SU(2) will also have other
representations, say T, for which T(u) ≠ T(u') even when π(u) = π(u'). These are
sometimes called double-valued representations of SO(3). Again we shall merely
quote the result: the irreducible representations of SU(2) are characterized by an
index k ≥ 0 which is either an integer or half an odd integer. Those for which k
is an integer are representations of SO(3) for the same index (i.e. k = l). The
others are only double-valued representations of SO(3). An example of the latter
is provided by the matrix representation we used to define SU(2), which is a
representation in two complex dimensions. It has k = ½ and is called the spin-½
representation. As with the l = 1 SO(3) representation, it is the smallest faithful
one for SU(2). If we take any basis vector of the space (called a spinor) and
operate on it by $\exp(tJ_1)$ as t goes from 0 to 4π, we see that the corresponding
path in SO(3), $\exp(tL_1)$, goes from 0 to 2π twice. When the sequence of
transformations reaches t = 2π we are back at the origin of SO(3), but are at −e in
SU(2). For this reason it is said that the spinor changes sign (e → −e) if it is
rotated once through an angle 2π.

It is indeed remarkable that this correspondence between representations is
not simply a mathematical game. The wave-function of a spin-½ elementary
particle is described by an element of an irreducible vector space of SU(2) for
nonintegral k. This is one example of what a physicist might regard as the
beautiful simplicity of nature. We begin with the Lie algebra of spherical
symmetry and we find that the group SU(2), not SO(3), is the simplest one having
that algebra, in that it has the simplest global topology. We then find that,
despite the difficulty of 'visualizing' the action of SU(2) in $R^3$, nature has made
the group more fundamental than SO(3) by providing particles which belong to
those of its representations which are not representations of SO(3)!


4 DIFFERENTIAL FORMS


The calculus of differential forms, developed in the early part of this century by 
E. Cartan, is one of the most useful and fruitful analytic techniques in differ- 
ential geometry. The catalogue of concepts that are unified and simplified by 
forms is astonishing: the theory of integration on manifolds, the cross-product, 
divergence, and curl of three-dimensional Euclidean geometry, determinants
of matrices, orientability of manifolds, integrability conditions for systems of
partial differential equations, Stokes’ theorem, Gauss’ theorem, and much more. 
As with most mathematical and physical ideas which are truly fundamental, the 
mathematics of forms is very simple. In this chapter, we introduce forms in the 
geometrical context in which they arise most naturally, and we then systemati- 
cally develop their power. 


A The algebra and integral calculus of forms 


4.1 Definition of volume: the geometrical role of differential forms

Until now we have avoided giving our manifolds any shape or rigidity.
We have mentioned the possibility of defining metric tensors, but we have con-
centrated on those analytic tools which are definable without reference to any 
particular metric. Now we will turn to the study of a particularly useful class of 
tensors: those which can serve to define volume elements on manifolds. 

Consider the notion of volume in two dimensions, where it is called area. Any 
pair of (infinitesimal) vectors in Euclidean space defines an (infinitesimal) area, 
as in figure 4.1: the area enclosed by the parallelogram they define. Now, a given 
area is defined by many different pairs of vectors, which may differ from one 
another in length and enclosed angle, as in figure 4.2. The notion of area is, 
therefore, less restrictive than the notion of a metric: the Euclidean metric 
defines the lengths of vectors and their enclosed angle, while the specification 
of area gives only one number associated with the two vectors. Naturally, if a 
metric exists it should uniquely define the area, and we shall show how this 
comes about later. But it is possible to define an area for a two-manifold (or a 
volume on an arbitrary manifold) without having to define a metric on the 
manifold. Indeed, many different metrics could define the same volume. 

Suppose that in a two-dimensional manifold we have at a point two linearly
independent infinitesimal vectors, forming a two-dimensional parallelogram. We
wish to define for this figure a (small) area, i.e. to associate with the two vectors
a single number. This number ought to double if we double the length of one
vector; moreover, we should require it to be additive under addition of vectors,
i.e.

$$\operatorname{area}(\bar a, \bar b) + \operatorname{area}(\bar a, \bar c) = \operatorname{area}(\bar a, \bar b + \bar c).$$

That this is true in Euclidean space is proved geometrically in figure 4.3. In the
second-to-last step we have used the fact that the area of a parallelogram is

Fig. 4.1. Two pairs of vectors and the area they define. 





Fig. 4.3. Geometrical proof that the area of a parallelogram is the value 
of a tensor. 

unchanged if one of its sides is displaced an arbitrary amount along the straight
line it defines. So we have proved that area( , ) is in fact a tensor, bilinear
in its arguments. Since the area is a number, this is a $\binom{0}{2}$ tensor. Moreover, if $\bar a$
and $\bar b$ are parallel, the area must vanish. The following exercise shows that as a
consequence the tensor must change sign if $\bar a$ and $\bar b$ are exchanged.


Exercise 4.1 

Prove that if B is a $\binom{0}{2}$ tensor with the property that $B(\bar V, \bar V) = 0$ for
all $\bar V$, then $B(\bar U, \bar W) = -B(\bar W, \bar U)$ for all $\bar U$, $\bar W$. (Hint: take $\bar V = \bar U + \bar W$.)
We say that B is antisymmetric in its arguments.


Consider this more closely. In figure 4.4 two vectors are drawn defining a
parallelogram of a certain area. In terms of components, the area is (to within a
sign) the determinant

$$\begin{vmatrix} V^x & V^y \\ W^x & W^y \end{vmatrix}.$$

Its antisymmetry under interchange of $\bar V$ and $\bar W$ is manifest.
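
The properties demanded of the area tensor can be confirmed on the 2 × 2 determinant with concrete numbers. This is a minimal sketch in Euclidean components; the function name `area` and the sample vectors are illustrative assumptions.

```python
def area(v, w):
    """Signed area of the parallelogram of v and w: the determinant V^x W^y - V^y W^x."""
    return v[0] * w[1] - v[1] * w[0]

a, b, c = (2.0, 1.0), (0.5, 3.0), (-1.0, 4.0)   # illustrative vectors
b_plus_c = (b[0] + c[0], b[1] + c[1])

additive = abs(area(a, b) + area(a, c) - area(a, b_plus_c)) < 1e-12  # linearity in one slot
doubles = area((2 * a[0], 2 * a[1]), b) == 2 * area(a, b)            # doubling one vector
antisymmetric = area(a, b) == -area(b, a)                            # sign change on exchange
vanishes_if_parallel = area(a, (3 * a[0], 3 * a[1])) == 0.0          # parallel vectors

print(area(a, b), additive, doubles, antisymmetric, vanishes_if_parallel)
```

The sign of `area(a, b)` is kept rather than discarded, in line with the convention adopted in the text.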

In ordinary uses one forgets the sign and calls the area the absolute value of 
that determinant. It will be convenient for us to keep the sign, since it contains 
information about the left- or right-handedness of the pair of vectors. We shall 
discuss this in more detail below. We shall also develop in more detail the evident 
relation between volume-tensors and determinants of matrices. But first we must 
develop the algebra of antisymmetric tensors. At first we will concentrate on 
their properties at any point, generalizing to fields later on. 










Fig. 4.4. The area defined by $\bar V$ and $\bar W$.


4.2 Notation and definitions for antisymmetric tensors

As in exercise 4.1 above, a $\binom{0}{2}$ tensor is said to be antisymmetric if its
value changes sign on interchange of its arguments:

$$\tilde\omega(\bar U, \bar V) = -\tilde\omega(\bar V, \bar U) \ \text{ for all } \bar U, \bar V \iff \tilde\omega \text{ antisymmetric}. \tag{4.1}$$

A tensor of type $\binom{0}{p}$, p ≥ 3, is said to be completely antisymmetric if it changes
sign on interchange of any two of its arguments. Antisymmetric tensors can
always be constructed from arbitrary ones. For example, if $\omega$ is a $\binom{0}{2}$ tensor and
$p$ a $\binom{0}{3}$ tensor then their totally antisymmetric parts are the tensors whose values
on arbitrary arguments are given by:

$$\omega_A(\bar U, \bar V) = \frac{1}{2!}\left[\omega(\bar U, \bar V) - \omega(\bar V, \bar U)\right], \tag{4.2}$$

$$p_A(\bar U, \bar V, \bar W) = \frac{1}{3!}\left[p(\bar U, \bar V, \bar W) + p(\bar V, \bar W, \bar U) + p(\bar W, \bar U, \bar V) - p(\bar V, \bar U, \bar W) - p(\bar W, \bar V, \bar U) - p(\bar U, \bar W, \bar V)\right]. \tag{4.3}$$

The rule is to take every permutation of the arguments; odd permutations
contribute minus signs and even ones plus signs. The factors 1/2! and 1/3! are the
conventional normalization, which is appropriate to calling $\omega_A$ the
antisymmetric part of $\omega$. These considerations all have counterparts in index notation,
obtained by letting the arbitrary vectors be basis vectors:

$$(\omega_A)_{ij} = \frac{1}{2!}(\omega_{ij} - \omega_{ji}) \equiv \omega_{[ij]}, \tag{4.4}$$

$$(p_A)_{ijk} = \frac{1}{3!}(p_{ijk} + p_{jki} + p_{kij} - p_{jik} - p_{kji} - p_{ikj}) \equiv p_{[ijk]}. \tag{4.5}$$


Here we have introduced the square bracket notation [i…k] to denote a
completely antisymmetric set of indices, including the corresponding normalization
factor. In what follows we will use a notation introduced above: a tilde (~) over
a tensor's name, e.g. $\tilde p$, denotes a completely antisymmetric tensor. The
one-form, for which we use the same notation, is a 'degenerate' case of this, because
it has only one argument.


Exercise 4.2 

(a) Prove that if the components of a $\binom{0}{p}$ tensor $\tilde p$ are antisymmetric under
interchange of any two indices, then $\tilde p$ is a completely antisymmetric
tensor.

(b) Suppose $\{A_{ijk}\}$ are the components of a completely antisymmetric
tensor. Show that
$$A_{ijk} = A_{[ijk]}.$$

(c) Suppose that A is an antisymmetric $\binom{0}{2}$ tensor and B an arbitrary $\binom{2}{0}$
tensor. Show that
$$A_{ij}B^{ij} = A_{ij}B^{[ij]},$$
i.e. that the contraction of A with B involves only the antisymmetric
part of B.

(d) Suppose A is as in (c) and B is a symmetric $\binom{2}{0}$ tensor: $B(\tilde\omega, \tilde\sigma)$
$= B(\tilde\sigma, \tilde\omega)$ for all one-forms $\tilde\omega$ and $\tilde\sigma$. Show that
$$A_{ij}B^{ij} = 0. \tag{4.6}$$


An important property of completely antisymmetric tensors is the following:
on an n-dimensional vector space, a completely antisymmetric $\binom{0}{p}$ tensor (p ≤ n)
has at most

$$\binom{n}{p} = \frac{n!}{p!\,(n-p)!} \tag{4.7}$$

independent components. To see this, note that any component is defined by
choosing p different numbers from the set (1, …, n). (They must be different,
because the component vanishes if any two indices are equal, just as in exercise
4.1.) The order in which the p numbers are chosen (their order as indices on
the tensor) can at most affect the sign of the component, so all components
whose indices are simply rearrangements of a given set of p numbers are known
if any one of them is known. The number of independent components is
therefore the number of different sets of p numbers chosen from n numbers, which
is the binomial coefficient given above.
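
The counting argument can be carried out by brute force: list every strictly increasing set of p indices drawn from 1, …, n and compare the number of sets with the binomial coefficient (4.7). The sketch below assumes nothing beyond that argument; the function name and the choice n = 4 are illustrative.

```python
from itertools import combinations
from math import comb

def independent_index_sets(n, p):
    """All index sets i_1 < ... < i_p from 1..n; one independent component for each."""
    return list(combinations(range(1, n + 1), p))

n = 4  # illustrative dimension
counts = {p: len(independent_index_sets(n, p)) for p in range(0, n + 2)}

matches_binomial = all(counts[p] == comb(n, p) for p in counts)
total_up_to_n = sum(counts[p] for p in range(0, n + 1))

print(counts, matches_binomial, total_up_to_n)
```

For p = n + 1 the list of index sets is empty, which is the statement that no nonzero completely antisymmetric tensor with more than n indices exists on an n-dimensional space.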


Exercise 4.3 
Prove that if p > n all the components of a completely antisymmetric
$\binom{0}{p}$ tensor on an n-dimensional vector space vanish.


4.3 Differential forms 

A p-form (p ≥ 2) is defined to be a completely antisymmetric tensor
of type $\binom{0}{p}$. As before, a one-form is a $\binom{0}{1}$ tensor. A scalar function is a
zero-form. The number p is the degree of the form.


Exercise 4.4 

Show that the set of all p-forms for fixed p is a vector space itself, 
under the addition operation defined in exercise 2.4. This is then a 
subspace of all $\binom{0}{p}$ tensors. What is its dimension?


Just as $\binom{0}{2}$ tensors could be made from $\binom{0}{1}$ tensors using the operation ⊗, we
define an operation ∧ (called 'wedge product') for constructing two-forms from
one-forms: if $\tilde p$ and $\tilde q$ are one-forms then

$$\tilde p \wedge \tilde q = \tilde p \otimes \tilde q - \tilde q \otimes \tilde p \tag{4.8}$$

is their wedge product. There is no factor of 1/2! in front, by contrast with
equation (4.2)!
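
Definition (4.8) translates directly into components, $(\tilde p \wedge \tilde q)_{ij} = p_i q_j - q_i p_j$. The following sketch builds that component array for two one-forms on a three-dimensional space and evaluates it on a pair of vectors; the function names and the integer sample components are illustrative assumptions.

```python
def tensor_product(p, q):
    """Components (p ⊗ q)_ij = p_i q_j of two one-forms."""
    return [[pi * qj for qj in q] for pi in p]

def wedge(p, q):
    """Components of p ∧ q = p ⊗ q − q ⊗ p, following (4.8) with no factor 1/2!."""
    pq, qp = tensor_product(p, q), tensor_product(q, p)
    n = len(p)
    return [[pq[i][j] - qp[i][j] for j in range(n)] for i in range(n)]

def evaluate(form2, u, v):
    """Value of a two-form on the vectors u, v: form2_ij u^i v^j."""
    n = len(u)
    return sum(form2[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def pair(form1, vec):
    """Value of a one-form on a vector."""
    return sum(f * x for f, x in zip(form1, vec))

p, q = [1, 2, 0], [0, 1, 3]      # illustrative one-form components
u, v = [1, 0, 2], [3, 1, 1]      # illustrative vector components

w = wedge(p, q)
lhs = evaluate(w, u, v)
rhs = pair(p, u) * pair(q, v) - pair(q, u) * pair(p, v)
is_antisymmetric = all(w[i][j] == -w[j][i] for i in range(3) for j in range(3))
self_wedge_vanishes = all(x == 0 for row in wedge(p, p) for x in row)

print(lhs, rhs, is_antisymmetric, self_wedge_vanishes)
```

With the normalization of (4.8), the value on two vectors is exactly p(u)q(v) − q(u)p(v), with no extra factor of one half.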


Exercise 4.5
Show that $\tilde p \wedge \tilde q$ is a two-form. Show that $\tilde p \wedge \tilde p = 0$.


Exercise 4.6
Let $\{\bar e_i, i = 1, \ldots, n\}$ be a basis for a vector space and $\{\tilde\omega^j\}$ its dual
basis for one-forms. Show that $\{\tilde\omega^j \wedge \tilde\omega^k, j, k = 1, \ldots, n\}$ is a basis for
the vector space of all two-forms. Hint: by considering explicitly the
numbers $\alpha_{ij} = \tilde\alpha(\bar e_i, \bar e_j)$, where $\tilde\alpha$ is an arbitrary two-form, show that

$$\tilde\alpha = \frac{1}{2!}\,\alpha_{ij}\,\tilde\omega^i \wedge \tilde\omega^j. \tag{4.9}$$


Note carefully the factor of 1/2! in (4.9), which occurs because the sum on
(i, j) includes equal contributions from $\tilde\omega^i \otimes \tilde\omega^j$ and $\tilde\omega^j \otimes \tilde\omega^i$. This factor
appears here because we did not put it into the definition of $\tilde\omega^i \wedge \tilde\omega^j$, equation
(4.8), as some textbooks do. This is a matter of convention.

The rule for wedge products extends naturally to three-forms: 

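$$\tilde p \wedge (\tilde q \wedge \tilde r) = (\tilde p \wedge \tilde q) \wedge \tilde r \equiv \tilde p \wedge \tilde q \wedge \tilde r = \tilde p \otimes \tilde q \otimes \tilde r + \tilde q \otimes \tilde r \otimes \tilde p + \cdots - \cdots \tag{4.10}$$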

using the same permutations and signs as in the previous paragraph. Notice that 
this expression and its generalization to higher numbers of one-forms permits 
one to define wedge products for arbitrary p- and q-forms, since by exercise 
4.6 any p-form can be written as a linear combination of wedge products of p 
one-forms (the basis one-forms). 

The set of all forms of arbitrary degree, equipped with the anticommutative
multiplication ∧, is called a Grassmann algebra.


Exercise 4.7 

Show that the sum of the dimensions of all the p-form spaces for
p ≤ n is $2^n$. (Hint: use the binomial theorem.) This is the dimension
of the space which has the Grassmann algebra.


Exercise 4.8 
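Show that if $\tilde p$ is a one-form and $\tilde q$ a two-form, then
$$(\tilde p \wedge \tilde q)_{ijk} = p_i q_{jk} + p_j q_{ki} + p_k q_{ij} = 3\,p_{[i} q_{jk]}.$$
More generally, show that if $\tilde p$ is a p-form and $\tilde q$ a q-form,
$$(\tilde p \wedge \tilde q)_{i \ldots j\, k \ldots l} = \binom{p+q}{p}\, p_{[i \ldots j}\, q_{k \ldots l]}. \tag{4.11}$$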


4.4 Manipulating differential forms

The algebra of forms is fairly simple, but it can lead to difficulties
keeping track of signs and factorials. In this and later sections the student should
find that careful, patient reasoning is the best approach to proving any result.
For instance, let us prove the commutation rule for forms. If $\tilde p$ is a p-form and
$\tilde q$ a q-form, then

$$\tilde p \wedge \tilde q = (-1)^{pq}\, \tilde q \wedge \tilde p. \tag{4.12}$$

To see this, first express $\tilde p$ and $\tilde q$ as sums over their components times wedge
products of the form $\tilde\omega^i \wedge \cdots \wedge \tilde\omega^j$ and $\tilde\omega^k \wedge \cdots \wedge \tilde\omega^l$ (p factors and q factors,
respectively, in each wedge product). Now we shall show that (4.12) applies to
each of the simple products

$$(\tilde\omega^i \wedge \cdots \wedge \tilde\omega^j) \wedge (\tilde\omega^k \wedge \cdots \wedge \tilde\omega^l).$$

By the associativity of the wedge product, the parentheses in this expression
are unnecessary. Now, if any two factors are exchanged (e.g. $\tilde\omega^j$ with $\tilde\omega^k$) the
expression changes sign. To move $\tilde\omega^j$ through the q factors $\tilde\omega^k \wedge \cdots \wedge \tilde\omega^l$
requires q such exchanges, so that

$$\tilde\omega^i \wedge \cdots \wedge \tilde\omega^j \wedge \tilde\omega^k \wedge \cdots \wedge \tilde\omega^l = (-1)^q\, \tilde\omega^i \wedge \cdots \wedge \tilde\omega^k \wedge \cdots \wedge \tilde\omega^l \wedge \tilde\omega^j.$$

Doing this for each of the p factors in $\tilde\omega^i \wedge \cdots \wedge \tilde\omega^j$ gives $[(-1)^q]^p$ times the
original, which proves (4.12).
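
The sign in (4.12) is the sign of the permutation that carries a block of p factors past a block of q factors, and this can be checked by counting inversions. The sketch below assumes only that each exchange of two adjacent one-form factors contributes a factor −1, as in the argument above; the function names are illustrative.

```python
def permutation_sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def swap_blocks_sign(p, q):
    """Sign acquired when the first p factors are moved behind the following q factors."""
    # Factors are labelled 0..p+q-1 in their original order; after the move
    # the q factors come first, followed by the p factors.
    reordered = list(range(p, p + q)) + list(range(p))
    return permutation_sign(reordered)

agrees = all(swap_blocks_sign(p, q) == (-1) ** (p * q)
             for p in range(6) for q in range(6))

print(swap_blocks_sign(1, 1), swap_blocks_sign(1, 2), swap_blocks_sign(3, 3), agrees)
```

Two one-forms anticommute (sign −1), while a one-form commutes with a two-form (sign +1), as (4.12) requires.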

An operation we will find useful later is the contraction of a vector with a 
form. A p-form requires p vector arguments to give a real number. If it is 
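supplied with one argument then it becomes a (p − 1)-form. To be definite we
define

$$\tilde\alpha(\bar\xi) \equiv \tilde\alpha(\bar\xi, \underbrace{\ \cdot\ , \ldots, \ \cdot\ }_{p-1 \text{ empty slots}}), \qquad [\tilde\alpha(\bar\xi)]_{j \ldots k} = \alpha_{ij \ldots k}\, \xi^i, \tag{4.13}$$

as the (p − 1)-form obtained by contracting $\tilde\alpha$ with $\bar\xi$. Note that putting $\bar\xi$ into
any slot other than the first would only affect the sign of $\tilde\alpha(\bar\xi)$. To get a feeling
for what this means, consider $\tilde\alpha = \tilde p \wedge \tilde q$, where $\tilde p$ and $\tilde q$ are one-forms:

$$(\tilde p \wedge \tilde q)(\bar\xi) = (\tilde p \otimes \tilde q - \tilde q \otimes \tilde p)(\bar\xi) = \tilde p(\bar\xi)\, \tilde q - \tilde q(\bar\xi)\, \tilde p.$$

Thus, although $\bar\xi$ is contracted with the first slot of $\tilde p \wedge \tilde q$, the permutations
implicit in the ∧-operation ensure that $\bar\xi$ is contracted with each one-form in the
wedge product. Similarly, for a product of p one-forms we find

$$(\tilde\omega^i \wedge \tilde\omega^j \wedge \cdots \wedge \tilde\omega^k)(\bar\xi) = \xi^i\, \tilde\omega^j \wedge \cdots \wedge \tilde\omega^k - \xi^j\, \tilde\omega^i \wedge \cdots \wedge \tilde\omega^k + \cdots = p\, \xi^{[i}\, \tilde\omega^j \wedge \cdots \wedge \tilde\omega^{k]}. \tag{4.14}$$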
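
From this and the generalization of (4.2) it follows that if $\tilde\alpha$ is a p-form,

$$\tilde\alpha(\bar\xi) = \frac{1}{(p-1)!}\, \xi^i \alpha_{ij \ldots k}\, \tilde\omega^j \wedge \cdots \wedge \tilde\omega^k. \tag{4.15}$$

This is, of course, implied directly by (4.13) and (4.9). Similarly, if $\tilde\alpha$ is any form
and $\tilde\beta$ is a p-form, then

$$(\tilde\beta \wedge \tilde\alpha)(\bar\xi) = \tilde\beta(\bar\xi) \wedge \tilde\alpha + (-1)^p\, \tilde\beta \wedge \tilde\alpha(\bar\xi). \tag{4.16}$$

Again this can be proved by looking at each component of $\tilde\beta \wedge \tilde\alpha$.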


Exercise 4.9 
Prove (4.16). 


A widely used alternative notation for $\tilde\alpha(\bar\xi)$ is $\bar\xi \,\lrcorner\, \tilde\alpha$.
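
The contraction rule (4.13) and its consequence for a wedge product of two one-forms can be verified in components. The sketch below assumes the component form $(\tilde p \wedge \tilde q)_{ij} = p_i q_j - q_i p_j$ of (4.8); the function names and sample components are illustrative.

```python
def wedge(p, q):
    """Components (p ∧ q)_ij = p_i q_j − q_i p_j of two one-forms, as in (4.8)."""
    n = len(p)
    return [[p[i] * q[j] - q[i] * p[j] for j in range(n)] for i in range(n)]

def contract_first(form2, xi):
    """(4.13) for a two-form: [α(ξ)]_j = α_ij ξ^i (vector in the first slot)."""
    n = len(xi)
    return [sum(form2[i][j] * xi[i] for i in range(n)) for j in range(n)]

def contract_second(form2, xi):
    """The same contraction with the vector placed in the second slot."""
    n = len(xi)
    return [sum(form2[i][j] * xi[j] for j in range(n)) for i in range(n)]

def pair(form1, vec):
    return sum(f * x for f, x in zip(form1, vec))

p, q = [1, 2, 0], [0, 1, 3]   # illustrative one-form components
xi = [2, 1, 1]                # illustrative vector components

lhs = contract_first(wedge(p, q), xi)
rhs = [pair(p, xi) * q[j] - pair(q, xi) * p[j] for j in range(3)]   # p(ξ) q − q(ξ) p
other_slot = contract_second(wedge(p, q), xi)

print(lhs, rhs, other_slot)
```

The result is again a one-form, and filling the second slot instead of the first changes only the overall sign, as noted after (4.13).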


4.5 Restriction of forms

An elementary but important concept is that of restricting a form to 
a subspace of the original vector space V. Since a p-form $\tilde\alpha$ is a $\binom{0}{p}$ tensor, its
domain is the set of all vectors in V (strictly, the domain is the product space
V × V × … × V, p 'copies' of V). The restriction of $\tilde\alpha$ to a subspace W of V is
the same p-form $\tilde\alpha$ whose domain is now restricted to vectors in W. We call this
$\tilde\alpha|_W$:

$$\tilde\alpha|_W(\bar X, \ldots, \bar Y) = \tilde\alpha(\bar X, \ldots, \bar Y),$$

where all of $\bar X, \ldots, \bar Y$ are in W. Thus, $\tilde\alpha|_W$ is defined only on W. Note that if the
dimension m of W is less than p, the restriction $\tilde\alpha|_W$ is necessarily zero (any
p-form is zero on an m-dimensional space if p > m), and if p = m then $\tilde\alpha|_W$ has
only one independent component. The operation of restricting a form is often
called sectioning it, because the picture one has is of the vector subspace W 
being a plane passing through (sectioning) the series of surfaces that represent 
a form. A form is said to be annulled by a vector subspace if its restriction to it 
vanishes. 


4.6 Fields of forms 

As with any tensor, a field of $p$-forms on a manifold $M$ is a rule (with
appropriate differentiability conditions) giving a $p$-form at each point of $M$.
Then all our remarks up to now apply to forms as functions on the space $T_P$ at
any point $P$ of $M$. Only one point needs to be made: since a submanifold $S$ of $M$
picks out a subspace $V_P$ of the manifold’s tangent space $T_P$ at every point $P$ of $S$,
we define the restriction of a $p$-form field $\tilde\alpha$ to $S$ to be that field formed by
restricting $\tilde\alpha$ at $P$ to $V_P$. We have seen an example of this for a one-form in §3.6.


4.7 Handedness and orientability

In an $n$-dimensional manifold there is only a one-dimensional space of
$n$-forms at any point (equation 4.7). Choose some $n$-form field, and call it $\tilde\omega$.
Consider a vector basis $\{\bar e_1, \ldots, \bar e_n\}$ at a point $P$. Since these are linearly
independent vectors, the number $\tilde\omega(\bar e_1, \ldots, \bar e_n)$ is nonzero if and only if $\tilde\omega \neq 0$ at
$P$. Therefore $\tilde\omega$ separates the set of all vector bases at $P$ into two classes, those
for which $\tilde\omega(\bar e_1, \ldots, \bar e_n)$ is positive and those for which it is negative. These
classes are in fact independent of $\tilde\omega$. For, if $\tilde\omega'$ is any other $n$-form nonzero at
$P$, then there exists a number $f \neq 0$ such that $\tilde\omega' = f\tilde\omega$. Any two bases which
made $\tilde\omega$ positive will give $\tilde\omega'$ the same sign (positive if $f > 0$, negative if $f < 0$)
and so will again be in the same class. So all bases at a point can be put into one
of two classes: right-handed and left-handed. (Which class has which name is,
of course, a convention; what is important is that the classes themselves are
distinct.) A manifold is said to be (internally) orientable if it is possible to define
handedness consistently (i.e. continuously) over the entire manifold, in the sense
that it is possible to define a continuous vector basis $\{\bar e_1(P), \ldots, \bar e_n(P)\}$ whose
handedness is the same everywhere. Clearly this is equivalent to being able to
define an $n$-form which is continuous and nonzero everywhere. Euclidean space
is orientable; the Möbius band is not.
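
The sorting of bases into two classes can be illustrated in three-dimensional Euclidean space. The sketch below assumes Cartesian components and takes the $n$-form to be a constant multiple `scale` of $\tilde dx \wedge \tilde dy \wedge \tilde dz$, so that its value on a basis is `scale` times a determinant; the function names are hypothetical.

```python
def det3(rows):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def handedness(basis, scale=1.0):
    """Sign of omega(e1, e2, e3) for omega = scale * dx ^ dy ^ dz."""
    value = scale * det3(basis)
    if value == 0:
        raise ValueError("vectors are linearly dependent or the n-form vanishes")
    return 1 if value > 0 else -1


standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]    # cyclic reordering of the standard basis
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # two basis vectors interchanged

# Bases in the same class for omega stay in the same class for omega' = f * omega.
for f in (1.0, -2.5):
    print(f, handedness(standard, f), handedness(cyclic, f), handedness(swapped, f))
```

Changing the sign of `scale` relabels the two classes but does not move any basis from one class to the other.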


4.8 Volumes and integration on oriented manifolds 

We return now to our view that forms are related to volume-elements. 
In an $n$-dimensional manifold, a set of $n$ linearly independent (‘infinitesimal’)
vectors defines a region of nonzero volume, an $n$-dimensional parallelepiped. The
volume of this region is then the value of an n-form. One is free to choose any 
n-form as the volume n-form; which one chooses will be determined by the 
particular problem one is solving. 

Now, integration of a function on a manifold involves essentially multiplying 
the value of the function by the volume of a small coordinate element and then 
adding up all such values. Following our discussion of volume-forms, we shall 
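introduce a useful notation for this. Suppose $\tilde\omega$ is an $n$-form on a region $U$
of an $n$-dimensional manifold $M$ whose coordinates are $\{x^1, \ldots, x^n\}$. Then
because all $n$-forms at a point form a one-dimensional vector space, there exists
some function $f(x^1, \ldots, x^n)$ such that

$$\tilde\omega = f\,\tilde dx^1 \wedge \ldots \wedge \tilde dx^n.$$

To integrate over $U$, we divide it up into tiny regions (‘cells’) spanned by
$n$-tuples of vectors $\{\Delta x^1\,\partial/\partial x^1, \Delta x^2\,\partial/\partial x^2, \ldots, \Delta x^n\,\partial/\partial x^n\}$, where the $\{\Delta x^i\}$
are very small numbers. The integral of the function $f$ over one small cell is
approximately the value of $f$ times the product

$$\Delta x^1\,\Delta x^2 \ldots \Delta x^n = \tilde dx^1 \wedge \ldots \wedge \tilde dx^n\,(\Delta x^1\,\partial/\partial x^1, \ldots, \Delta x^n\,\partial/\partial x^n).$$
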


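Thus, we have

$$\int_{\text{cell}} f(x^1, \ldots, x^n)\,d^n x \approx \tilde\omega(\text{cell}). \tag{4.17}$$

Adding up all the contributions from the different cells and taking the limit as
the size of each cell goes to zero gives what we call the integral of $\tilde\omega$ over $U$:

$$\int_U \tilde\omega = \int f(x^1, \ldots, x^n)\,d^n x, \tag{4.18}$$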


where the integral on the right is the ordinary integral of calculus and the 
integral on the left is our new notation. Since the version on the left does not 
mention coordinates, we must prove that it really does not depend on the co- 
ordinates chosen for U. We shall restrict our proof to two dimensions, since the 
generalization will be obvious. Consider first coordinates $\lambda$ and $\mu$. Then we have

$$\int \tilde\omega = \int f(\lambda, \mu)\,\tilde d\lambda \wedge \tilde d\mu = \int f(\lambda, \mu)\,d\lambda\,d\mu.$$


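When we change to coordinates $x$ and $y$, the chain rule gives

$$\tilde d\lambda = \frac{\partial\lambda}{\partial x}\,\tilde dx + \frac{\partial\lambda}{\partial y}\,\tilde dy,$$

$$\tilde d\mu = \frac{\partial\mu}{\partial x}\,\tilde dx + \frac{\partial\mu}{\partial y}\,\tilde dy,$$

which follows from the definition of $\tilde d\lambda$ as a gradient. So we get (remember
$\tilde dx \wedge \tilde dx = 0$ since it is antisymmetric)

$$\tilde d\lambda \wedge \tilde d\mu = \left(\frac{\partial\lambda}{\partial x}\,\tilde dx + \frac{\partial\lambda}{\partial y}\,\tilde dy\right) \wedge \left(\frac{\partial\mu}{\partial x}\,\tilde dx + \frac{\partial\mu}{\partial y}\,\tilde dy\right)$$

$$= \frac{\partial\lambda}{\partial x}\frac{\partial\mu}{\partial y}\,\tilde dx \wedge \tilde dy + \frac{\partial\lambda}{\partial y}\frac{\partial\mu}{\partial x}\,\tilde dy \wedge \tilde dx$$

$$= \left(\frac{\partial\lambda}{\partial x}\frac{\partial\mu}{\partial y} - \frac{\partial\lambda}{\partial y}\frac{\partial\mu}{\partial x}\right) \tilde dx \wedge \tilde dy. \tag{4.19}$$

The factor in front of $\tilde dx \wedge \tilde dy$ is the Jacobian of the coordinate transformation,
$\partial(\lambda, \mu)/\partial(x, y)$. From ordinary calculus we know that is how volume elements
do in fact transform. Therefore, the $(\lambda, \mu)$-integral of $f$ is related to the $(x, y)$-
integral of $f$ in exactly the right way.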
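
As a concrete check of (4.19), one may let Cartesian coordinates play the role of $(\lambda, \mu)$ and polar coordinates $(r, \theta)$ the role of the new pair, so that the wedge product supplies the familiar factor $r$. The following is a rough sketch under that assumption; the function names and the midpoint cell sum are illustrative choices, not part of the text.

```python
import math


def wedge_coefficient(a, b):
    """Coefficient of dr ^ dtheta in a ^ b, for a = a[0] dr + a[1] dtheta, etc."""
    return a[0] * b[1] - a[1] * b[0]


def jacobian(r, theta):
    """Coefficient J in dx ^ dy = J dr ^ dtheta, for x = r cos(theta), y = r sin(theta)."""
    dx = (math.cos(theta), -r * math.sin(theta))   # gradient of x in (r, theta)
    dy = (math.sin(theta), r * math.cos(theta))    # gradient of y in (r, theta)
    return wedge_coefficient(dx, dy)


def integrate_over_disk(f, r_max, n_r=200, n_theta=200):
    """Midpoint cell sum of f(x, y) dx ^ dy over a disk, using (r, theta) cells."""
    dr = r_max / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * jacobian(r, theta) * dr * dtheta
    return total


area = integrate_over_disk(lambda x, y: 1.0, 1.0)
second_moment = integrate_over_disk(lambda x, y: x * x + y * y, 1.0)
print(area, math.pi)               # the two numbers should agree closely
print(second_moment, math.pi / 2)  # agreement up to the discretization error
```

The Jacobian is never inserted by hand: it comes out of the antisymmetric product of the two gradients, exactly as in the derivation above.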

But the value of $\int \tilde\omega$ is not quite independent of the coordinates originally
chosen. What we have shown is that a coordinate transformation does not
change its value, but there is an ambiguity of sign in equation (4.17). This
equation provided the original definition of $\int \tilde\omega$, and this definition would have
given us the opposite sign had our original coordinate system had a basis of the
opposite handedness to the one we chose. The right-hand side of (4.17) would
have been the same (the form is basis-independent) but on the left-hand
side $f$ would have changed sign. (This change is not the sort of coordinate
transformation we discussed above, in which $d^n x$ would be multiplied by a negative
Jacobian and all would be well. It is a change in the original identification of
the symbol $\int \tilde\omega$ with an integral from the calculus.) This ambiguity cannot be
avoided. It is conventional to choose an orientation in $U$, i.e. define which of
the two sets of bases is right-handed, and to use a right-handed coordinate
system in the definition (4.17). We therefore find that the integral of $\tilde\omega$ over
the region $U$ is independent of everything except orientation.

It was important in this argument that U be covered by a single coordinate 
system. Can we extend this integral to all of M, which may not have a global 
coordinate system? It is clear that if two coordinate patches have a single 
connected overlap region, the orientation chosen on one induces a unique 
orientation on the other, and the integral over the union of the two regions is 
well-defined. Clearly this can be extended to M as a whole if and only if M is
orientable. From now on we shall restrict ourselves to integration on orientable 
manifolds, but it must be mentioned that the theory has been extended to non- 
orientable manifolds by de Rham, and this can have interesting physical appli- 
cations (see the paper by Sorkin (1977) in the bibliography). 

Integration as we have defined it is always done over forms of the maximum 
degree: n-forms on n-dimensional manifolds. One can of course integrate a 
p-form over a p-dimensional submanifold, provided the submanifold is itself 
internally orientable. How is the internal orientation of a submanifold S related 
to that of $M$? Suppose $M$ is orientable and $P$ is a point of $S$. Given an $n$-form
$\tilde\omega$ defined as ‘right-handed’ (or ‘positively oriented’) at $P$, is there a unique
‘induced’ orientation for $p$-forms of $S$ at $P$? Unfortunately not, because $\tilde\omega$ on
its own does not do anything for $S$, as its restriction to $S$ is zero since $p < n$.
What is usually done is to reduce $\tilde\omega$ from an $n$-form to a $p$-form by defining
$n - p$ linearly independent ‘normal vectors’ at $P$ not tangent to $S$ and defining
the restriction of the $p$-form

$$\tilde\omega(\bar n_1, \ldots, \bar n_{n-p})$$

to $S$ to be right-handed. This definition clearly depends on the choice of the
vectors $\{\bar n_i\}$, including the order in which they are numbered. Such a choice is
called choosing an external orientation for $S$ at $P$. We shall give an example of
this in our proof of Stokes’ theorem below. If it is possible to define the external
orientation $\{\bar n_i, i = 1, \ldots, n - p\}$ continuously over all of $S$ (and ‘continuously’
means keeping the $\bar n_i$ linearly independent and not tangent to $S$) then $S$ is said
to be externally orientable.

It is clear that if some open region of $M$ containing $S$ is orientable then either
$S$ is both internally and externally orientable or it is neither, and that if no such
region of $M$ is orientable $S$ may be one but not both. For example, consider a
Möbius strip as a two-dimensional submanifold of $R^3$ (figure 4.5) and a curve in
the strip as a one-dimensional submanifold of the strip (figure 4.6). Set up a
right-handed triad of vectors at any point $P$ of the strip, two lying in the strip
and one out of it. Carry them continuously once around the strip, keeping the
two always tangent to it. The outward pointing one always returns pointing to
the opposite side: the Möbius band is not externally orientable in $R^3$. Similarly,
set up two vectors in the strip, one tangent to the curve $\mathcal{C}_1$, and the other not.
Transport these continuously around and the outward pointing one returns
pointing to the other side of the curve in the strip. Since we know that the curve
is internally orientable (this is a property independent of any space it is
embedded in) it cannot be externally orientable in a larger nonorientable manifold.


Fig. 4.5. The Möbius band in $R^3$. It is easiest to imagine it made of
rubber, lying flat on the page except near the top of the figure, where
the single twist is. A triad at $P$, carried clockwise around (dashed path),
returns in a way which cannot be continuously deformed into its
original while keeping vectors 1 and 2 in the band and all three linearly
independent.


Fig. 4.6. Curves in the Möbius band. Curve $\mathcal{C}_1$ is not externally
orientable: vectors 1 and 2 begin at $P$ and are transported as in figure 4.5.
They return in a way which cannot be continuously deformed into
the original while keeping 1 tangent to the curve and both linearly
independent. But curve $\mathcal{C}_2$ is externally orientable because it has a
neighborhood (dotted line) in which a consistent choice of orientation
is possible.

By contrast, the curve $\mathcal{C}_2$ is both internally and externally orientable in the
strip because it does not ‘feel’ the nonorientability of the strip: it has a
neighborhood in the strip which is orientable.


4.9 N-vectors, duals, and the symbol $\epsilon_{ij\ldots k}$

We have so far considered completely antisymmetric $\binom{0}{p}$ tensors, but
of course the Grassmann algebra can be constructed for $\binom{p}{0}$ tensors in a parallel
fashion. A completely antisymmetric $\binom{p}{0}$ tensor is called a $p$-vector. As for
forms, the vector space of all $p$-vectors at any point in an $n$-dimensional
manifold is of dimension $C^n_p$.

Notice that at a point there are four spaces which have equal dimension: the
vector spaces of $p$-forms, $(n-p)$-forms, $p$-vectors, and $(n-p)$-vectors all have
dimension $C^n_p = C^n_{n-p}$. Under certain circumstances one can find a 1-1 mapping
between various of these spaces. We saw in §2.29 that a metric tensor gives a
1-1 map from $\binom{0}{p}$ tensors to $\binom{p}{0}$ tensors. It is not hard to see that this map
preserves antisymmetry, so that it maps $p$-forms into $p$-vectors invertibly. Whether
or not a metric is defined, however, a volume $n$-form $\tilde\omega$ (i.e. an $n$-form which is
nowhere zero) provides a mapping between $p$-forms and $(n-p)$-vectors. This
map is called the dual map, and we now show how to construct it. (Do not
confuse this map, which depends on $\tilde\omega$ and maps a single $\binom{0}{p}$ tensor into a unique
$\binom{n-p}{0}$ tensor and vice versa, with the concept of a dual basis for one-forms
discussed in chapter 2, which does not involve $\tilde\omega$ and maps a set of $n$ $\binom{1}{0}$ tensors
into a unique set of $n$ $\binom{0}{1}$ tensors and vice versa.)

A given $q$-vector $\mathbf{T}$ with components $T^{i\ldots k} = T^{[i\ldots k]}$ ($q$ indices) defines a
tensor $\tilde A$ by the equation

$$A_{j\ldots l} = \frac{1}{q!}\,\omega_{i\ldots k\,j\ldots l}\,T^{i\ldots k}. \tag{4.20}$$

Symbolically, we write

$$\tilde A = \tilde\omega(\mathbf{T})$$

or simply

$$\tilde A = {}^*\mathbf{T}. \tag{4.21}$$

We say that $\tilde A$ is the dual of $\mathbf{T}$ with respect to $\tilde\omega$. From (4.20) and the
antisymmetry of $\omega_{i\ldots l}$ under interchange of any two indices it is clear that $\tilde A$ is a
completely antisymmetric tensor of degree $n - q$ (which is the number of indices
on $\tilde\omega$ left over after contracting with the $q$ indices on $\mathbf{T}$). That is, $\tilde A$ is an $(n-q)$-
form. This map defines a unique $(n-q)$-form from any $q$-vector. We will show
that it is invertible below, but first we will show that one has already become
familiar with this map in the form of the cross-product in three-dimensional
Euclidean vector algebra.

To understand this, recall that in Euclidean space one usually does not
distinguish between vectors and one-forms: in Cartesian coordinates the
components of a vector and of its associated one-form are equal. So consider two
vectors $\bar U$ and $\bar V$ and their one-forms $\tilde U$ and $\tilde V$. The two-form $\tilde U \wedge \tilde V$ has $C^3_2 = 3$
independent components, which are $U_1 V_2 - U_2 V_1$, $U_1 V_3 - U_3 V_1$, $U_2 V_3 - U_3 V_2$.
The vector $\bar U \times \bar V$ has the same components, and it is easy to show that

$${}^*(\bar U \times \bar V) = \tilde U \wedge \tilde V \quad \text{(three dimensions)}. \tag{4.22}$$


Exercise 4.10 
Prove (4.22) by using (4.20). 


This illuminates a number of odd things about the cross-product: why it exists
at all, why it does not exist in other than three dimensions (only in three
dimensions does the dual map take vectors into two-forms), and why $\bar U \times \bar V$ is an
‘axial’ vector. This last property follows from the convention that
$\tilde\omega$ in Euclidean space gives a positive volume to the basis $(\bar e_1, \bar e_2, \bar e_3)$. If the
handedness of the basis is changed, then so is the sign of $\tilde\omega$ and, consequently,
the sign of $\bar U \times \bar V$ (which depends on the sign of $\tilde\omega$ in that it must be mapped by
$\tilde\omega$ into $\tilde U \wedge \tilde V$, whose sign does not change). Under an inversion of coordinates,
then, the conventional cross-product changes sign.
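
Equation (4.22) and the ‘axial’ behaviour can be checked numerically in Cartesian components. The sketch below assumes that the Euclidean volume form has components $+1$, $-1$ or $0$ according to the permutation of its indices, multiplied by a hypothetical `orientation` factor of $\pm 1$ standing for the choice of handedness; indices run over 0, 1, 2 and all function names are illustrative.

```python
def volume_form(i, j, k):
    """Cartesian components of the volume three-form for 0-based indices."""
    if {i, j, k} != {0, 1, 2}:
        return 0
    return 1 if (i, j, k) in ((0, 1, 2), (1, 2, 0), (2, 0, 1)) else -1


def cross(u, v):
    """Conventional cross-product of two three-component vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dual_of_vector(w, orientation=1):
    """Two-form A_{jk} = omega_{ijk} w^i from (4.20) with q = 1."""
    return [[sum(orientation * volume_form(i, j, k) * w[i] for i in range(3))
             for k in range(3)] for j in range(3)]


def wedge(u, v):
    """Components (U ^ V)_{jk} = U_j V_k - U_k V_j."""
    return [[u[j] * v[k] - u[k] * v[j] for k in range(3)] for j in range(3)]


U = [1, 2, 3]    # sample components, chosen only for illustration
V = [-2, 0, 5]
W = cross(U, V)

print(dual_of_vector(W) == wedge(U, V))
# With the opposite handedness, the vector mapped into U ^ V is -W.
print(dual_of_vector([-c for c in W], orientation=-1) == wedge(U, V))
```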

The map between $\mathbf{T}$ and ${}^*\mathbf{T}$ is invertible because they each have the same
number of components. (Said another way, contracting $\mathbf{T}$ with $\tilde\omega$ in (4.20) loses
no information from $\mathbf{T}$ because $\mathbf{T}$ is already antisymmetric on all its indices.)
That is, for a given $p$-form $\tilde A$ there is a unique $(n-p)$-vector $\mathbf{T}$ for which
$\tilde A = {}^*\mathbf{T}$. This can be formalized by defining an $n$-vector $\omega^{i\ldots k}$, the inverse of
$\omega_{i\ldots k}$, by the equation

$$\omega^{ij\ldots k}\,\omega_{ij\ldots k} = n!. \tag{4.23}$$

The factor of $n!$ is used because the sum in (4.23) has $n!$ equal terms:
$\omega^{123\ldots n}\,\omega_{123\ldots n} = \omega^{213\ldots n}\,\omega_{213\ldots n} = \ldots$. The $n!$ factor assures the normalization

$$\omega^{123\ldots n} = \frac{1}{\omega_{123\ldots n}}. \tag{4.24}$$

Then we say that $\mathbf{S}$ is the dual of $\tilde B$ with respect to $\tilde\omega$,

$$\mathbf{S} = {}^*\tilde B, \tag{4.25}$$

if (for $\tilde B$ a $p$-form)

$$S^{j\ldots l} = \frac{1}{p!}\,\omega^{i\ldots k\,j\ldots l}\,B_{i\ldots k}. \tag{4.26}$$

To illustrate the inverse property of the two dual relations, let us look first at
scalar functions. The function $f$, viewed as a zero-vector, has the $n$-form dual
$f\tilde\omega$. This $n$-form has the dual zero-vector

$${}^*(f\tilde\omega) = \frac{1}{n!}\,\omega^{i\ldots k}\,(f\,\omega_{i\ldots k}) = f.$$

Thus we have proved ${}^{**}f = f$.
The general relation of this sort is found as follows. Start with a $p$-form $\tilde B$
and define the $(n-p)$-vector $\mathbf{S}$ by (4.26). We take the dual of $\mathbf{S}$:

$$({}^*\mathbf{S})_{j\ldots l} = \frac{1}{(n-p)!}\,\omega_{i\ldots k\,j\ldots l}\,S^{i\ldots k}$$

$$= \frac{1}{p!\,(n-p)!}\,\omega_{i\ldots k\,j\ldots l}\,\omega^{r\ldots s\,i\ldots k}\,B_{r\ldots s}$$

$$= \frac{(-1)^{p(n-p)}}{p!\,(n-p)!}\,\omega_{i\ldots k\,j\ldots l}\,\omega^{i\ldots k\,r\ldots s}\,B_{r\ldots s}.$$

To get the last line one has to move each of the $(n-p)$ indices $i \ldots k$ ‘through’
(by permutation) all of the $p$ indices $r \ldots s$, giving $(n-p)$ factors of $(-1)^p$.
Now, fix the indices $(j \ldots l)$ at, say, $(1 \ldots p)$. (Their names clearly cannot
matter.) Then in the sum (for fixed $(r \ldots s)$)

$$\omega_{i\ldots k\,1\ldots p}\,\omega^{i\ldots k\,r\ldots s}$$

the indices $i \ldots k$ must be chosen from the set $(p+1, \ldots, n)$. There will
therefore be at most $(n-p)!$ nonzero terms in this sum, and each such term
will be equal to every other one, just as in (4.23). So we have

$$\omega_{i\ldots k\,1\ldots p}\,\omega^{i\ldots k\,r\ldots s} = (n-p)!\,\omega_{p+1\ldots n\,1\ldots p}\,\omega^{p+1\ldots n\,r\ldots s}.$$

Moreover, this will be zero unless $(r \ldots s)$ is a permutation of $(1 \ldots p)$, for
otherwise the second $\omega$ will have repeated indices. In the sum over $(r \ldots s)$,

$$\omega^{p+1\ldots n\,r\ldots s}\,B_{r\ldots s},$$

there are thus at most $p!$ nonzero terms, and again each of them equals every
other. So we have

$$\omega^{p+1\ldots n\,r\ldots s}\,B_{r\ldots s} = p!\,\omega^{p+1\ldots n\,1\ldots p}\,B_{1\ldots p}.$$

Combining all these results gives

$$({}^*\mathbf{S})_{1\ldots p} = (-1)^{p(n-p)}\,\omega_{p+1\ldots n\,1\ldots p}\,\omega^{p+1\ldots n\,1\ldots p}\,B_{1\ldots p}.$$

But, from (4.24) we see that

$$\omega_{p+1\ldots n\,1\ldots p}\,\omega^{p+1\ldots n\,1\ldots p} = 1,$$

and so

$$({}^*\mathbf{S})_{1\ldots p} = (-1)^{p(n-p)}\,B_{1\ldots p}.$$

Since the labels $1 \ldots p$ could have stood for any indices, we have proved that

$${}^{**}\tilde B = (-1)^{p(n-p)}\,\tilde B. \tag{4.27a}$$

Similarly, had we started with a $q$-vector $\mathbf{T}$, we would have found

$${}^{**}\mathbf{T} = (-1)^{q(n-q)}\,\mathbf{T}. \tag{4.27b}$$

Notice that if $n$ is odd, the factor $(-1)^{p(n-p)}$ is always $+1$.
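
The sign in (4.27a) can be verified by brute force for small $n$. The following is a minimal sketch, assuming a volume form with components $f\,\epsilon_{ij\ldots k}$ and inverse $\epsilon^{ij\ldots k}/f$ for a constant $f$ (here `f = 2.0`), with 0-based indices; the sample $p$-form and all function names are hypothetical.

```python
from itertools import product
from math import factorial


def parity(idx):
    """0 if an index repeats, else +1 or -1 for an even or odd ordering of idx."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(1 for a in range(len(idx))
                     for b in range(a + 1, len(idx)) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1


def make_form(n, p):
    """All components of a sample antisymmetric p-index object in n dimensions."""
    def value(sorted_idx):  # arbitrary nonzero number per independent component
        return 1.0 + sum(3 ** k * i for k, i in enumerate(sorted_idx))
    return {idx: parity(idx) * value(tuple(sorted(idx)))
            for idx in product(range(n), repeat=p)}


def dual(components, rank, n, scale):
    """Contract the first `rank` indices of scale * epsilon with `components`."""
    out = {}
    for free in product(range(n), repeat=n - rank):
        total = sum(parity(summed + free) * components[summed]
                    for summed in product(range(n), repeat=rank))
        out[free] = scale * total / factorial(rank)
    return out


def check_double_dual(n, p, f=2.0):
    """True if **B = (-1)^{p(n-p)} B for the sample p-form B."""
    B = make_form(n, p)
    S = dual(B, p, n, 1.0 / f)      # (4.26), with inverse components epsilon / f
    back = dual(S, n - p, n, f)     # (4.20), with components f * epsilon
    sign = (-1) ** (p * (n - p))
    return all(abs(back[idx] - sign * B[idx]) < 1e-9 for idx in B)


for n in (2, 3, 4):
    print(n, [check_double_dual(n, p) for p in range(n + 1)])
```

The same `dual` routine serves for both directions because (4.20) and (4.26) have the same index structure; only the overall factor differs.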

As mentioned before, a metric maps $p$-forms into $p$-vectors. Combined with
the dual map this gives a map from $p$-forms to $(n-p)$-forms, or $q$-vectors to
$(n-q)$-vectors. This map is usually simply called * as well. But some caution
regarding signs is necessary when the metric is indefinite (when, as in relativity, 
some lengths are positive and some negative). This is discussed in more detail 
in a later section, and an example of the use of this metric dual is given in 
exercise 5.13. 

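In the algebra of forms it is often convenient to introduce the completely
antisymmetric Levi-Civita symbols

$$\epsilon_{ij\ldots k} = \epsilon^{ij\ldots k} = \begin{cases} +1 & \text{if } ij\ldots k \text{ is an even permutation of } 1, 2, \ldots, n; \\ -1 & \text{if } ij\ldots k \text{ is an odd permutation of } 1, 2, \ldots, n; \\ 0 & \text{otherwise.} \end{cases} \tag{4.28}$$

For instance, the form $\tilde dx^1 \wedge \tilde dx^2 \wedge \tilde dx^3$ on a three-dimensional manifold has
components $\epsilon_{ijk}$ in the coordinate system $(x^1, x^2, x^3)$, but will have
components $h\,\epsilon_{ijk}$ in other coordinates, where $h$ is some function. Suppose that a
volume-form $\tilde\omega$ has components

$$\omega_{ij\ldots k} = f\,\epsilon_{ij\ldots k}, \tag{4.29}$$

where $f$ is some function. Then its inverse is

$$\omega^{ij\ldots k} = \frac{1}{f}\,\epsilon^{ij\ldots k}. \tag{4.30}$$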
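
The symbol (4.28) is easy to compute by counting inversions, and (4.29) and (4.30) can then be checked against the normalization (4.23). This is only a sketch with 0-based indices and a constant sample value of $f$; the function names are hypothetical.

```python
from itertools import product
from math import factorial


def levi_civita(idx):
    """Symbol (4.28) for 0-based indices: +1/-1 for even/odd permutations, else 0."""
    n = len(idx)
    if sorted(idx) != list(range(n)):
        return 0
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1


def full_contraction(n, f):
    """Sum of omega^{i...k} omega_{i...k}, using (4.29) and (4.30)."""
    return sum((levi_civita(idx) / f) * (f * levi_civita(idx))
               for idx in product(range(n), repeat=n))


for n in (2, 3, 4):
    print(n, full_contraction(n, 2.0), factorial(n))
```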


4.10 Tensor densities

We have taken the point of view that any nonzero n-form on an n- 
dimensional manifold defines a volume element. It sometimes happens that a 
given problem has two or three such n-forms. (An example of this is the flow 
of a perfect fluid, discussed in chapter 5. In the three-dimensional manifold 
of Euclidean space there are three physically defined three-forms: one whose 
integral gives the volume of a region, another the mass, and a third a conserved 
quantity related to the vorticity.) This makes it more convenient on occasion 
to relate all such forms to the coordinate-dependent $n$-form $\tilde dx^1 \wedge \tilde dx^2
\wedge \ldots \wedge \tilde dx^n$, whose components are simply $\epsilon_{ij\ldots k}$. If $\tilde\omega$ is an $n$-form of interest,
then the relation (4.29) rewritten as

$$\omega_{ij\ldots k} = \mathfrak{w}\,\epsilon_{ij\ldots k}$$

defines a quantity $\mathfrak{w}$ which is called a scalar density. Although $\mathfrak{w}$ is a function
on the manifold, it is not a true scalar because it depends on the coordinates.
Under a change of coordinates to $x^{i'} = f^{i'}(x^j)$, the components of $\tilde\omega$ are
multiplied by the Jacobian $J$ of the transformation (equation (4.19)) while $\epsilon_{ij\ldots k}$ is
by definition unchanged. So $\mathfrak{w}$ obeys the law

$$\mathfrak{w}' = J\,\mathfrak{w}.$$


This is the transformation law for a scalar density of weight 1. (The term
‘weight’ is defined below.) It is possible to extend this to tensor densities.
Suppose, for instance, that $\mathbf{T}$ is a $\binom{2}{n}$ tensor on an $n$-dimensional manifold which
is completely antisymmetric in its vector arguments:

$$T^{ij}{}_{k\ldots l} = T^{ij}{}_{[k\ldots l]} \quad (n \text{ lower indices}).$$

Then upon contraction with two one-forms $\tilde\alpha$ and $\tilde\beta$, $\mathbf{T}$ produces a volume-form
$\tilde\tau(\tilde\alpha, \tilde\beta)$:

$$\tilde\tau(\tilde\alpha, \tilde\beta) = \mathbf{T}(\tilde\alpha, \tilde\beta;\ \_, \ldots, \_),$$

$$\tau_{k\ldots l} = T^{ij}{}_{k\ldots l}\,\alpha_i\,\beta_j.$$

(Such tensors may arise in physics. For instance, the stress tensor mentioned in
chapter 2 gives the density of stress when given two one-forms; the total stress
is the integral of this over the volume, the integral of the contraction of the $\binom{2}{n}$
tensor obtained by multiplying the stress tensor by the volume-form.) It is
possible to write the components of $\mathbf{T}$ as

$$T^{ij}{}_{k\ldots l} = \mathfrak{T}^{ij}\,\epsilon_{k\ldots l},$$

which defines the numbers $\{\mathfrak{T}^{ij}\}$, which are components of a $\binom{2}{0}$ tensor density.
(It is conventional to use German letters to denote densities.) The
transformation law for such a density is

$$\mathfrak{T}^{i'j'} = J\,\Lambda^{i'}{}_k\,\Lambda^{j'}{}_l\,\mathfrak{T}^{kl}, \tag{4.31}$$

where again $J$ is the Jacobian (determinant of $\Lambda^i{}_{j'}$). This is the transformation
law for a $\binom{2}{0}$ tensor density of weight 1.

The term weight refers to the number of factors of J in the transformation 
law. For instance, a number $\mathfrak{w}$ which transforms by

$$\mathfrak{w}' = J^2\,\mathfrak{w}$$

is a scalar density of weight two. The generalization to tensor densities and to 
other weights is obvious. (An ordinary tensor is a density of weight zero.) The 
interpretation of densities of weights other than zero or one is more
complicated, but such quantities do on occasion prove useful. In this book we shall
not deal with densities, preferring to use the $n$-forms themselves.




4.11 Generalized Kronecker deltas 
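The Levi-Civita symbol has many useful and interesting properties,
some of which we explore in this and the next section. As we saw earlier, one
often encounters products of $\epsilon$'s, such as $\epsilon_{ij\ldots k}\,\epsilon^{lm\ldots r}$. It is possible to develop a
systematic and convenient method of handling them.

First we note that, in two dimensions, for any nonzero two-form $\tilde\omega$ we have

$$\omega_{ij}\,\omega^{kl} = \epsilon_{ij}\,\epsilon^{kl} = \delta^k{}_i\,\delta^l{}_j - \delta^l{}_i\,\delta^k{}_j. \tag{4.32}$$

The first equality follows from (4.29) and (4.30). To establish the second it is
easiest simply to note that both sides are antisymmetric in $(k, l)$ and $(i, j)$, so it
suffices to consider the case $i \neq j$, $k \neq l$. There is, up to a sign, only one such
term, $\epsilon_{12}\,\epsilon^{12} = 1$. Clearly the right-hand side of (4.32) also gives one, which
proves the result. A nearly identical chain of reasoning leads to the general
result for $n$ dimensions:

$$\epsilon_{ij\ldots k}\,\epsilon^{lm\ldots r} = \delta^l{}_i\,\delta^m{}_j \ldots \delta^r{}_k - \delta^m{}_i\,\delta^l{}_j \ldots \delta^r{}_k + \ldots = n!\,\delta^{[l}{}_i\,\delta^m{}_j \ldots \delta^{r]}{}_k. \tag{4.33}$$

There is an abbreviated notation for this. We define the $p$-delta symbol by

$$\delta^{i\ldots j}{}_{k\ldots l} = p!\,\delta^{[i}{}_k \ldots \delta^{j]}{}_l, \tag{4.34}$$

where the sets $(i \ldots j)$ and $(k \ldots l)$ each contain $p$ indices. Then we have as a
special case

$$\epsilon_{ij\ldots k}\,\epsilon^{lm\ldots r} = \delta^{lm\ldots r}{}_{ij\ldots k}. \tag{4.35}$$

The $p$-delta symbol can be obtained from the $(p+1)$-delta symbol by
contraction, conventionally on the first indices. We begin with

$$\delta^{ij\ldots k}{}_{im\ldots s} = (p+1)!\,\delta^{[i}{}_i\,\delta^j{}_m \ldots \delta^{k]}{}_s.$$

The terms can be arranged as:

$$= p!\,\delta^i{}_i\,\delta^{[j}{}_m \ldots \delta^{k]}{}_s - p!\,\delta^j{}_i\,\delta^{[i}{}_m \ldots \delta^{k]}{}_s - \ldots - p!\,\delta^k{}_i\,\delta^{[j}{}_m \ldots \delta^{i]}{}_s$$

$$= p!\,\{\,n\,\delta^{[j}{}_m \ldots \delta^{k]}{}_s - \delta^{[j}{}_m \ldots \delta^{k]}{}_s - \ldots - \delta^{[j}{}_m \ldots \delta^{k]}{}_s\,\},$$

where the braces contain $p$ subtracted terms, which gives

$$\delta^{ij\ldots k}{}_{im\ldots s} = (n - p)\,\delta^{j\ldots k}{}_{m\ldots s} \tag{4.36}$$

for the single contraction of a $(p+1)$-delta in an $n$-dimensional space.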


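Exercise 4.11
(a) Justify each of the steps in the derivation of (4.36).
(b) Obtain the $p$-delta from the $n$-delta by $n - p$ contractions:

$$\delta^{i\ldots j\,k\ldots l}{}_{i\ldots j\,r\ldots s} = (n - p)!\,\delta^{k\ldots l}{}_{r\ldots s}, \tag{4.37}$$

where the contracted set $(i \ldots j)$ contains $n - p$ indices and the sets $(k \ldots l)$
and $(r \ldots s)$ each contain $p$ indices.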


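As an example of the utility of this algebra, we shall calculate the triple cross-
product in three-dimensional Euclidean space. In Cartesian coordinates the *
operator uses $\epsilon_{ijk}$, so

$$(\bar U \times \bar V)_i = \epsilon_{ijk}\,U^j V^k,$$

and therefore

$$[\bar W \times (\bar U \times \bar V)]_i = \epsilon_{ijk}\,W^j\,\epsilon^{klm}\,U_l V_m = \epsilon_{kij}\,\epsilon^{klm}\,W^j U_l V_m.$$

Using (4.34) and (4.36) we have

$$[\bar W \times (\bar U \times \bar V)]_i = (\delta^l{}_i\,\delta^m{}_j - \delta^m{}_i\,\delta^l{}_j)\,W^j U_l V_m = U_i\,(\bar W \cdot \bar V) - V_i\,(\bar W \cdot \bar U).$$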

This derivation is so quick that it should make memorization of the triple cross- 
product formula completely unnecessary! 
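
Both the contracted identity used above and the resulting triple-product formula can be checked by direct summation. The sketch below works in Cartesian components with 0-based indices; the helper names and sample vectors are hypothetical.

```python
from itertools import product


def eps(i, j, k):
    """Three-dimensional Levi-Civita symbol for indices 0, 1, 2."""
    if {i, j, k} != {0, 1, 2}:
        return 0
    return 1 if (i, j, k) in ((0, 1, 2), (1, 2, 0), (2, 0, 1)) else -1


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


# eps_{kij} eps^{klm} = delta^l_i delta^m_j - delta^m_i delta^l_j, for every i, j, l, m
identity_holds = all(
    sum(eps(k, i, j) * eps(k, l, m) for k in range(3))
    == delta(l, i) * delta(m, j) - delta(m, i) * delta(l, j)
    for i, j, l, m in product(range(3), repeat=4)
)


def cross(u, v):
    """(U x V)_i = eps_{ijk} U^j V^k."""
    return [sum(eps(i, j, k) * u[j] * v[k] for j in range(3) for k in range(3))
            for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


W, U, V = [1, -2, 3], [4, 0, -1], [2, 5, 7]   # sample integer components
lhs = cross(W, cross(U, V))
rhs = [U[i] * dot(W, V) - V[i] * dot(W, U) for i in range(3)]
print(identity_holds, lhs == rhs)
```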


4.12 Determinants and $\epsilon_{ij\ldots k}$

Consider a $2 \times 2$ matrix with elements $A^{ij}$. We shall show that

$$\det(A) = \epsilon_{ij}\,A^{1i} A^{2j}. \tag{4.38}$$

To show this, write the sum on the right-hand side out explicitly:

$$\epsilon_{ij}\,A^{1i} A^{2j} = \epsilon_{12}\,A^{11} A^{22} + \epsilon_{21}\,A^{12} A^{21},$$

where we have used the fact that $\epsilon_{11} = \epsilon_{22} = 0$. Now, we also have that
$\epsilon_{12} = -\epsilon_{21} = 1$, so we get

$$A^{11} A^{22} - A^{12} A^{21},$$

which is the definition of the determinant of the matrix. The next exercise
generalizes this to $n \times n$ matrices.


Exercise 4.12

(a) Show that the determinant of an $n \times n$ matrix with elements $A^{ij}$
($i, j = 1, \ldots, n$) is

$$\det(A) = \epsilon_{ij\ldots k}\,A^{1i} A^{2j} \ldots A^{nk}. \tag{4.39}$$

(Hint: the determinant of an $n \times n$ matrix is defined in terms of $(n-1)
\times (n-1)$ determinants by the cofactor rule. Use that rule to prove
(4.39) by induction from the $2 \times 2$ case.)

(b) Show that

$$\det(A) = \frac{1}{n!}\,\epsilon_{ab\ldots c}\,\epsilon_{ij\ldots k}\,A^{ai} A^{bj} \ldots A^{ck}.$$


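Exercise 4.13
If a manifold has a metric, let $\{\tilde\omega^i\}$ be an orthonormal basis for one-
forms, and define $\tilde\omega$ to be the preferred volume-form

$$\tilde\omega = \tilde\omega^1 \wedge \tilde\omega^2 \wedge \ldots \wedge \tilde\omega^n.$$

Show that, if $\{x^{k'}\}$ is an arbitrary coordinate system,

$$\tilde\omega = |g|^{1/2}\,\tilde dx^{1'} \wedge \tilde dx^{2'} \wedge \ldots \wedge \tilde dx^{n'}, \tag{4.40}$$

where $g$ is the determinant of the matrix of components $g_{i'j'}$ of the
metric tensor in these coordinates.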


Again it is interesting to look explicitly at the three-dimensional Euclidean case.
The volume of a parallelepiped formed by the three vectors $\bar a$, $\bar b$, and $\bar c$ is the
determinant of the matrix whose rows are the components of those vectors.
From (4.39), therefore,

$$\text{volume} = \epsilon_{ijk}\,a^i b^j c^k = a^i\,(\epsilon_{ijk}\,b^j c^k) = a^i\,(\bar b \times \bar c)_i = \bar a \cdot (\bar b \times \bar c),$$


another well-known expression for the volume. 
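
The sum (4.39) translates directly into code: only index sets that are permutations contribute, each weighted by its sign. The sketch below uses 0-based indices and hypothetical sample matrices, and compares the result with the triple product for three vectors used as rows.

```python
from itertools import permutations


def parity(perm):
    """+1 or -1 for an even or odd permutation, found by counting inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def det_levi_civita(A):
    """det(A) = eps_{ij...k} A^{1i} A^{2j} ... A^{nk}; row r is paired with column perm[r]."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


a, b, c = [2, 0, 1], [1, 3, 2], [1, 1, 4]     # sample vectors used as matrix rows
volume = det_levi_civita([a, b, c])
triple = sum(x * y for x, y in zip(a, cross(b, c)))
print(volume, triple)
```

The loop visits $n!$ terms, so this is a check of the formula rather than an efficient way to compute determinants.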


4.13 Metric volume elements 
In exercise 4.13 we used the metric of a manifold to define a certain
orthonormal basis $\{\tilde\omega^i\}$, from which we constructed an $n$-form $\tilde\omega$ (equation
(4.40)) which we called ‘the preferred volume-form’. Does this form deserve
the name ‘preferred’: is it unique, or does it depend upon the particular
orthonormal basis (which certainly is not unique) used to define it? The answer is
that it is unique, apart from a sign. To see this, note that the components of
$\tilde\omega$ on the original basis are, by definition, $\epsilon_{ij\ldots k}$. If $\{\tilde\omega^{j'}\}$ is any other
orthonormal basis, then the components of $\tilde\omega$ on this basis are $J\,\epsilon_{i'j'\ldots k'}$, where $J$ is
the Jacobian of the transformation from $\{\tilde\omega^j\}$ to $\{\tilde\omega^{j'}\}$. But, because the two
bases are orthonormal, this Jacobian is $\pm 1$ (proved below). Therefore, the
form $\tilde\omega$ differs from the ‘preferred’ form defined by $\{\tilde\omega^{j'}\}$ by at most a sign.
If we adopt a convention for handedness, we can define $\tilde\omega$ by right-handed
orthonormal bases, and it is unique. So a metric defines a unique volume-form
for an oriented manifold. On intuitive grounds, of course, this is not at all
surprising.

To prove this result we used the fact that the Jacobian of a transformation
from one orthonormal basis to another, which is just the determinant of
the transformation matrix $\Lambda^i{}_{j'}$, has absolute value one. This is not hard to
establish. We start from the general transformation law for the metric tensor’s
components

$$g_{i'j'} = \Lambda^k{}_{i'}\,\Lambda^l{}_{j'}\,g_{kl},$$

which can be written in matrix language as (cf. §2.29)

$$(g') = (\Lambda)^T (g)\,(\Lambda).$$

The determinant of this transformation law for $g_{ij}$ gives

$$\det(g') = \det(g)\,[\det(\Lambda)]^2.$$

But in an orthonormal basis, $g_{ij}$ is a matrix which has $\pm 1$ on the diagonal and
zero elsewhere. (Recall that if $g_{ij}$ is an indefinite metric not all diagonal elements
will have the same sign.) So the determinant of $g_{ij}$ is $\pm 1$, and has the same sign
in all orthonormal bases. Therefore we have for the Jacobian

$$\det(\Lambda) = J = \pm 1.$$
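
A two-dimensional illustration: a rotation preserves the Euclidean metric, a boost preserves an indefinite metric with signature $(-, +)$, and a reflection preserves both, and in every case the determinant has absolute value one. This is only a sketch; the sample angle and rapidity are arbitrary and the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def transformed_metric(g, lam):
    """Matrix form of the transformation law: (g') = (Lambda)^T (g) (Lambda)."""
    return matmul(transpose(lam), matmul(g, lam))


euclidean = [[1.0, 0.0], [0.0, 1.0]]
minkowski = [[-1.0, 0.0], [0.0, 1.0]]    # indefinite metric in two dimensions

angle, rapidity = 0.3, 0.7               # arbitrary sample parameters
rotation = [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]
boost = [[math.cosh(rapidity), math.sinh(rapidity)],
         [math.sinh(rapidity), math.cosh(rapidity)]]
reflection = [[1.0, 0.0], [0.0, -1.0]]

for name, g, lam in (("rotation", euclidean, rotation),
                     ("boost", minkowski, boost),
                     ("reflection", minkowski, reflection)):
    print(name, transformed_metric(g, lam), det2(lam))
```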


For indefinite metrics, the dual operation * can be defined in either of two
ways, which arise because $\omega^{i\ldots k}$, the inverse of the volume-form, has two
‘natural’ definitions which may differ by a sign. The point of view we took
earlier was that

$$\omega^{i\ldots k}\,\omega_{i\ldots k} = n!,$$

i.e. that

$$\omega^{12\ldots n} = (\omega_{12\ldots n})^{-1}.$$

But if there is a metric one might like to define an $n$-vector $\omega'$ by raising the
indices of $\tilde\omega$:

$$(\omega')^{ij\ldots k} = g^{il}\,g^{jm} \ldots g^{kr}\,\omega_{lm\ldots r}.$$

From equations (4.39) and (4.40) it follows that

$$(\omega')^{ij\ldots k} = |g|^{1/2}\,\det(g^{lm})\,\epsilon^{ij\ldots k}.$$

Now, since $(g^{lm})$ is the matrix inverse to $(g_{lm})$, its determinant is $g^{-1}$ and we
have

$$(\omega')^{12\ldots n} = \frac{|g|^{1/2}}{g}, \tag{4.41}$$

whereas we had

$$\omega^{12\ldots n} = \frac{1}{\omega_{12\ldots n}} = \frac{1}{|g|^{1/2}}. \tag{4.42}$$

If $g$ is negative, these differ by a sign. It is conventional in relativity, where $g$ is
negative, to use $\omega'$ in the inverse-dual relations. This introduces an extra minus
sign into equations like (4.27).


B The differential calculus of forms and its applications


Where there is an integral calculus there is also a differential calculus, and so we
shall introduce the exterior derivative, which operates on forms and produces
forms which are their derivatives. The exact sense in which exterior
differentiation is the inverse of integration is shown in Stokes’ theorem, proved below,
which is the generalization of the fundamental theorem of calculus,

$$\int_a^b \tilde d f = f(b) - f(a). \tag{4.43}$$


We will then go on to show the close relationship between differential forms and 
partial differential equations. 


4.14 The exterior derivative 

We want to define a derivative operator on forms which preserves their
character as forms and which is inverse to integration, in the sense of (4.43)
above. Note that if $M$ is a one-dimensional manifold, the operator $\tilde d$ which takes
a zero-form $f$ to a one-form $\tilde d f$ does indeed satisfy (4.43) above. So what we
want is to extend $\tilde d$ to forms of higher degree. By analogy with the operation of
$\tilde d$ on zero-forms, it must raise the degree of a form. Thus, if $\tilde\alpha$ is a $p$-form, then $\tilde d\tilde\alpha$
is to be a $(p+1)$-form. The appropriate way to extend $\tilde d$ is as follows (where
$\tilde\alpha$ is a $p$-form and $\tilde\beta$, $\tilde\gamma$ are $q$-forms):

(i) $\tilde d(\tilde\beta + \tilde\gamma) = (\tilde d\tilde\beta) + (\tilde d\tilde\gamma)$
(ii) $\tilde d(\tilde\alpha \wedge \tilde\beta) = (\tilde d\tilde\alpha) \wedge \tilde\beta + (-1)^p\,\tilde\alpha \wedge \tilde d\tilde\beta$
(iii) $\tilde d(\tilde d\tilde\alpha) = 0$

Property (ii) is just the Leibniz rule apart from the $(-1)^p$, which comes about
because one has to bring the operator $\tilde d$ ‘through’ the $p$-form $\tilde\alpha$ in order to get at
$\tilde\beta$, and this involves ‘exchanging’ it with $p$ one-forms, each exchange contributing
a factor of $-1$. This property guarantees that $\tilde d$ will preserve the rule (4.12). (A
derivative with the property (ii) is called an antiderivation.) Property (iii) is at first
sight surprising, but on examination for the case where $\tilde\alpha$ is a function $f$ proves
sensible: the one-form $\tilde d f$ has components $\partial f/\partial x^i$. A second derivative would
have components that were linear combinations of $\partial^2 f/\partial x^i \partial x^j$. But to be a two-
form this second derivative would have to be antisymmetric in $i$ and $j$, whereas
$\partial^2 f/\partial x^i \partial x^j$ is symmetric (partial derivatives commute). Therefore it is sensible
that it vanishes. The properties (i)-(iii) plus the definition of $\tilde d$ on functions
uniquely determine $\tilde d$. (This is a theorem whose rather long proof may be found
in any of the standard references.)

Exercise 4.14
(a) Show that

$$\tilde d(f\,\tilde d g) = \tilde d f \wedge \tilde d g. \tag{4.44}$$

(b) Use (a) to show that if

$$\tilde\alpha = \frac{1}{p!}\,\alpha_{i\ldots j}\,\tilde dx^i \wedge \ldots \wedge \tilde dx^j$$

is the expression for the $p$-form $\tilde\alpha$ in a coordinate basis, then

$$\tilde d\tilde\alpha = \frac{1}{p!}\,\frac{\partial}{\partial x^k}(\alpha_{i\ldots j})\,\tilde dx^k \wedge \tilde dx^i \wedge \ldots \wedge \tilde dx^j,$$

and hence that

$$(\tilde d\tilde\alpha)_{k\,i\ldots j} = (p+1)\,\frac{\partial}{\partial x^{[k}}\,\alpha_{i\ldots j]}. \tag{4.45}$$


4.15 Notation for derivatives 

We shall have frequent occasion to use partial derivatives from now on.
There is a standard and convenient notation: for any function $f$ on the manifold,

$$\frac{\partial f}{\partial x^i} \equiv f_{,i}. \tag{4.46}$$

Notice that $f$ might itself be the component of a tensor, in which case the
comma follows all other indices:

$$\frac{\partial V^i}{\partial x^k} \equiv V^i{}_{,k}. \tag{4.47}$$

Second derivatives are denoted by more indices after the comma, but conventionally
no extra commas are used:

$$\frac{\partial^2 f}{\partial x^k\,\partial x^l} \equiv f_{,lk}. \tag{4.48}$$

The indices are to be read left-to-right to find the order in which the derivatives
are applied (the opposite to the $\partial/\partial x^i$ convention). Note carefully that partial
differentiation is not an allowed tensor operation on components as discussed in
§2.27. That is, the functions $\{V^i{}_{,j}\}$ do not in general equal the functions

$$\{\Lambda^i{}_{a'}\,\Lambda^{b'}{}_j\,V^{a'}{}_{,b'}\},$$

which are obtained by transforming the partial derivatives from another set of
coordinates. (Recall the discussion in §3.4 of the problems involved in defining
differentiation of tensors on a manifold.) An exception to this rule is differentiation
of a scalar function, where we have seen that $f_{,i}$ is the component
of the one-form $\tilde{d}f$. (Here it is worth recalling the distinction between scalar and
function drawn in §2.28.) An example of our new notation is afforded by the
Lie bracket:

$$[\bar{U}, \bar{V}]^i = U^j V^i{}_{,j} - V^j U^i{}_{,j}.$$

Although each term on the right separately does not transform as a tensor,
together they do. Similarly, the partial derivatives in (4.45) appear in a combination
which also transforms as a tensor.


Exercise 4.15 

Show that $V^i{}_{,j}$ does not transform as a tensor under a general
coordinate transformation, and then show that $[\bar{U}, \bar{V}]^i$ does transform
as a vector.


With the convention that the derivative index is placed after all the others, 
(4.45) becomes 


$$(\tilde{d}\tilde{\alpha})_{i\ldots jk} = (-1)^p\,(p+1)\,\alpha_{[i\ldots j,k]}. \tag{4.49}$$


4.16 Familiar examples of exterior differentiation 
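Just as the wedge product gave us the cross-product in three dimensions,
so the exterior derivative (‘wedge-derivative’) gives us the curl. Consider a
vector $\bar{a}$. The exterior derivative of its associated one-form is

$$\begin{aligned}
\tilde{d}\tilde{a} &= \tilde{d}(a_1\,\tilde{d}x^1 + a_2\,\tilde{d}x^2 + a_3\,\tilde{d}x^3) \\
&= a_{1,j}\,\tilde{d}x^j \wedge \tilde{d}x^1 + a_{2,j}\,\tilde{d}x^j \wedge \tilde{d}x^2 + a_{3,j}\,\tilde{d}x^j \wedge \tilde{d}x^3.
\end{aligned}$$

Since $\tilde{d}x^1 \wedge \tilde{d}x^1 = 0$ and similarly for indices 2 and 3, this becomes

$$\begin{aligned}
\tilde{d}\tilde{a} = {} & (a_{1,2} - a_{2,1})\,\tilde{d}x^2 \wedge \tilde{d}x^1 + (a_{2,3} - a_{3,2})\,\tilde{d}x^3 \wedge \tilde{d}x^2 \\
& + (a_{3,1} - a_{1,3})\,\tilde{d}x^1 \wedge \tilde{d}x^3.
\end{aligned}$$

The curl is clearly involved here. To isolate it as a vector, we take the dual:

$$\begin{aligned}
*\tilde{d}\tilde{a} &= (a_{1,2} - a_{2,1})\,{*}(\tilde{d}x^2 \wedge \tilde{d}x^1) + \ldots \\
&= (a_{1,2} - a_{2,1})\left(-\frac{\partial}{\partial x^3}\right) + \ldots \\
&= (a_{2,1} - a_{1,2})\,\frac{\partial}{\partial x^3} + \ldots,
\end{aligned}$$

$$*\tilde{d}\tilde{a} = \nabla \times \bar{a}. \tag{4.50}$$

So the curl operator in three dimensions is $*\tilde{d}$.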
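Not only the curl, but also the divergence comes from exterior differentiation.
In this case the appropriate operator is $\tilde{d}\,{*}$. That is, start with a vector $\bar{a}$
and take its dual:

$$\begin{aligned}
*\bar{a} &= {*}\left(a^1\frac{\partial}{\partial x^1} + a^2\frac{\partial}{\partial x^2} + a^3\frac{\partial}{\partial x^3}\right) \\
&= \tfrac{1}{2}\,a^1 \epsilon_{1jk}\,\tilde{d}x^j \wedge \tilde{d}x^k + \ldots \\
&= a^1\,(\tilde{d}x^2 \wedge \tilde{d}x^3) + \ldots.
\end{aligned}$$

Then the exterior derivative of this is

$$\begin{aligned}
\tilde{d}\,{*}\bar{a} &= a^1{}_{,j}\,\tilde{d}x^j \wedge \tilde{d}x^2 \wedge \tilde{d}x^3 + \ldots \\
&= a^1{}_{,1}\,\tilde{d}x^1 \wedge \tilde{d}x^2 \wedge \tilde{d}x^3 + \ldots \\
&= (a^i{}_{,i})\,\tilde{d}x^1 \wedge \tilde{d}x^2 \wedge \tilde{d}x^3.
\end{aligned} \tag{4.51}$$

(In going from the first line to the second only $j = 1$ survives in the wedge product.)
We have therefore shown that

$$\tilde{d}\,{*}\bar{a} = (\nabla \cdot \bar{a})\,\tilde{\omega}, \tag{4.52}$$

where $\tilde{\omega} = \tilde{d}x^1 \wedge \tilde{d}x^2 \wedge \tilde{d}x^3$ is the Euclidean volume-element in Cartesian
coordinates. We shall generalize this divergence formula to arbitrary manifolds and
arbitrary p-vectors in §4.23.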
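The component formulas of this section can be checked numerically. The following is only a minimal sketch under stated assumptions: the helper names (`partial`, `two_form_component`, `curl_from_d`, `div_from_d`) and the sample field are illustrative choices, not part of the text, and derivatives are estimated by central differences in Cartesian coordinates. It uses the fact that the independent components of the two-form obtained from a one-form with components a_i are a_{j,i} - a_{i,j}, and that the coefficient in (4.51) is a^i_{,i}.

```python
def partial(f, x, i, h=1e-5):
    """Central-difference estimate of the partial derivative of f along x^i at x."""
    xp = list(x)
    xm = list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)


def component(a, i):
    """Return the scalar function p -> a_i(p) for a vector-valued function a."""
    return lambda p: a(p)[i]


def two_form_component(a, x, i, j):
    """Component a_{j,i} - a_{i,j} multiplying dx^i ^ dx^j (indices 0, 1, 2)."""
    return partial(component(a, j), x, i) - partial(component(a, i), x, j)


def curl_from_d(a, x):
    """Cartesian curl read off from the exterior derivative of the one-form a."""
    return (two_form_component(a, x, 1, 2),
            two_form_component(a, x, 2, 0),
            two_form_component(a, x, 0, 1))


def div_from_d(a, x):
    """Coefficient a^i_{,i} of the volume form in (4.51)."""
    return sum(partial(component(a, i), x, i) for i in range(3))


# Hypothetical sample field a = (-y, x, z^2): its curl is (0, 0, 2), its divergence 2z.
sample = lambda p: (-p[1], p[0], p[2] ** 2)
print(curl_from_d(sample, [0.3, -0.7, 1.1]))
print(div_from_d(sample, [0.3, -0.7, 1.1]))
```

In Euclidean space with Cartesian coordinates the components of a vector and of its associated one-form coincide, which is why a single function `a` serves for both here.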


Exercise 4.16 

Use (4.50), (4.52), and property (iii) of §4.14 to show that (in three-
dimensional Euclidean vector calculus) the divergence of a curl and the
curl of a gradient both vanish.


4.17 Integrability conditions for partial differential equations 

Exterior differentiation, like forms themselves, is closely related to 
familiar concepts from calculus. As an example, consider the system of partial 
differential equations 


$$\frac{\partial f}{\partial x} = g(x, y), \qquad \frac{\partial f}{\partial y} = h(x, y). \tag{4.53}$$

By letting $(x, y)$ be coordinates of a manifold, this can be written as

$$f_{,i} = a_i,$$

where $a_x = g$ and $a_y = h$. This equation, in turn, has the coordinate-independent
form

$$\tilde{d}f = \tilde{a}, \tag{4.54}$$

where $\tilde{a}$ is a one-form with components $g$ and $h$. Now, if $f$ is a solution to this
equation then we get a valid equation by operating with $\tilde{d}$ upon it:

$$\tilde{d}(\tilde{d}f) = \tilde{d}\tilde{a}.$$

But the left-hand side vanishes by property (iii) of the definition of $\tilde{d}$, so we
have that a necessary condition for the solution to exist is that

$$\tilde{d}\tilde{a} = 0.$$

In component language this is

$$a_{[i,j]} = 0,$$

which is really only one equation (a two-form on a two-dimensional manifold):

$$\frac{\partial a_x}{\partial y} - \frac{\partial a_y}{\partial x} = \frac{\partial g}{\partial y} - \frac{\partial h}{\partial x} = 0. \tag{4.55}$$


These are, of course, the integrability conditions for the equations. Thus, the 
exterior calculus gives a geometric derivation of these conditions, and it is 
usually the easiest way to derive them because of the conciseness of its notation. 
The fact that the integrability conditions are sufficient conditions for the exist- 
ence of a solution is assured by Frobenius’ theorem, in the version described 

in §4.26. 
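As a concrete illustration of condition (4.55), here is a short worked example. The particular functions g and h are chosen purely for illustration and do not come from the text.

```latex
% Illustrative choice: g = 2xy, h = x^2 + 1.
\begin{align*}
\frac{\partial g}{\partial y} - \frac{\partial h}{\partial x} &= 2x - 2x = 0
  && \text{(the integrability condition holds)}, \\
f(x, y) &= x^2 y + y + \text{const}
  && \text{(found by integrating } \partial f/\partial x = g \text{ and matching } \partial f/\partial y = h\text{)}, \\
\frac{\partial f}{\partial x} &= 2xy = g, \qquad
\frac{\partial f}{\partial y} = x^2 + 1 = h .
\end{align*}
% Counter-example: g = y, h = -x.
\begin{align*}
\frac{\partial g}{\partial y} - \frac{\partial h}{\partial x} &= 1 - (-1) = 2 \neq 0,
\end{align*}
% so no function f can satisfy both equations of (4.53).
```

In the second case the one-form with components (y, -x) is not closed, so the system has no solution anywhere.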


4.18 Exact forms 

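By definition of the exterior derivative $\tilde{d}$, the statement $\tilde{\alpha} = \tilde{d}\tilde{\beta}$ implies
$\tilde{d}\tilde{\alpha} = 0$. It is natural to ask for the converse: if $\tilde{d}\tilde{\alpha} = 0$, do we know there exists
a $\tilde{\beta}$ such that $\tilde{\alpha} = \tilde{d}\tilde{\beta}$? A form $\tilde{\alpha}$ for which $\tilde{d}\tilde{\alpha} = 0$ is said to be closed; a form $\tilde{\alpha}$
for which $\tilde{\alpha} = \tilde{d}\tilde{\beta}$ is said to be exact. Is a closed form exact? In the next section
we will prove that the answer is yes in the following sense. Consider a neighborhood
$\mathcal{U}$ of a point $P$, in which $\tilde{\alpha}$ is everywhere defined and in which $\tilde{d}\tilde{\alpha} = 0$.
Then there exists a sufficiently small neighborhood of $P$ in which a form $\tilde{\beta}$ is
everywhere defined and for which $\tilde{\alpha} = \tilde{d}\tilde{\beta}$. Clearly, $\tilde{\beta}$ is not unique: $\tilde{\beta} + \tilde{d}\tilde{\gamma}$ for
any $\tilde{\gamma}$ (of the right degree) also works.

We only claim that a closed form is exact locally, because the statement is not
always true globally. Given an arbitrary region $\mathcal{U}$ of a manifold in which $\tilde{\alpha}$ is
defined and closed, it may not be possible to find a single $\tilde{\beta}$ defined everywhere
in $\mathcal{U}$ for which $\tilde{\alpha} = \tilde{d}\tilde{\beta}$.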

We give the following example in $R^2$. In figure 4.7, consider the annulus enclosed
between the curves $\mathcal{C}_1$ and $\mathcal{C}_2$, and consider Cartesian coordinates $x$ and
$y$, whose origin $P$ is inside $\mathcal{C}_2$. The one-form

$$\tilde{\alpha} = \frac{x\,\tilde{d}y - y\,\tilde{d}x}{x^2 + y^2}$$

is defined everywhere between the curves and has the property $\tilde{d}\tilde{\alpha} = 0$, as one
can easily verify. Is there a function $f$ such that $\tilde{\alpha} = \tilde{d}f$? If we introduce the
usual polar coordinates $r$ and $\theta$, then it is easy to see that $\tilde{\alpha} = \tilde{d}\theta$, so we apparently
have the answer ‘yes’. But there is a problem: $\theta$ is not a single-valued continuous
function everywhere in the region of interest, the region between $\mathcal{C}_1$
and $\mathcal{C}_2$. Therefore, although $\tilde{\alpha}$ is well-defined everywhere in this region, there is
no function $f$ such that $\tilde{\alpha} = \tilde{d}f$ everywhere. The answer is ‘yes’ locally, but ‘no’
globally. This problem would not go away if we ignored $\mathcal{C}_2$ and considered the
whole interior of $\mathcal{C}_1$, since $\tilde{\alpha}$ is not defined at $x = y = 0$. If, on the other hand, we
considered the region shown in figure 4.8, then the problem does go away: in
this case $\tilde{\alpha}$ is defined everywhere inside $\mathcal{C}$, and $\theta$ can be chosen single-valued and
continuous inside $\mathcal{C}$ as well. So in this simple example we have found that,
whereas locally $\tilde{d}\tilde{\alpha} = 0 \Rightarrow \tilde{\alpha} = \tilde{d}f$, the global question (whether $f$ is defined everywhere)
depends on the region being considered.
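The global obstruction in this example can be seen numerically. The sketch below is only an illustration under stated assumptions: the function names and the choice of circular loops are hypothetical, and the integral is approximated by a midpoint rule. If the one-form were the differential of a single-valued function on the region, its integral around every closed loop in the region would vanish. It does vanish for a loop that does not encircle the origin, but a loop around the origin returns the jump in the polar angle.

```python
import math


def alpha(x, y):
    """Components (a_x, a_y) of the one-form (x dy - y dx) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def loop_integral(cx, cy, radius, n=2000):
    """Integrate alpha once counter-clockwise round the circle of the given
    centre and radius, using the midpoint rule with n steps."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * dt
        dy = radius * math.cos(t) * dt
        ax, ay = alpha(x, y)
        total += ax * dx + ay * dy
    return total


print(loop_integral(0.0, 0.0, 1.0))  # loop encircling the origin: close to 2*pi
print(loop_integral(3.0, 0.0, 1.0))  # loop not encircling the origin: close to 0
```

The first loop lies in an annulus like that of figure 4.7; the second lies in a region of the kind shown in figure 4.8, where a single-valued angle exists.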

It is clear that we are dealing with one aspect of the topology of a region or a 
manifold. The study of those topological properties which determine the rela- 
tion between closed and exact forms is called cohomology theory. After we have 
proved Stokes’ theorem we will have enough mathematical machinery to take at 
least a brief look at cohomology theory in §4.24.


Fig. 4.7. An annular region of $R^2$. The region does not include its
boundaries.

Fig. 4.8. A region of $R^2$ similar to that in figure 4.7 but whose boundary
is a single connected curve. The discontinuity in $\theta$ (where $\theta$ jumps
from $2\pi$ down to 0) on any circle $r$ = const about $P$ can be made to
take place outside $\mathcal{C}$.
4.19 Proof of the local exactness of closed forms

We shall prove the following theorem, known as the Poincaré lemma.
Let $\tilde{\alpha}$ be a closed p-form ($\tilde{d}\tilde{\alpha} = 0$) defined everywhere in a region $U$ of $M$, and
let $U$ have a 1-1 differentiable map onto the unit open ball of $R^n$, i.e. the
interior of the sphere $S^{n-1}$ defined by $(x^1)^2 + (x^2)^2 + \ldots + (x^n)^2 = 1$. Then
in $U$ there is a (p − 1)-form $\tilde{\beta}$ for which $\tilde{\alpha} = \tilde{d}\tilde{\beta}$.

Before proving this let us see what this map is. Clearly it means that U is 
covered by a single topologically Cartesian coordinate system. This is really a 
topological condition on U: the region shown in figure 4.7 does not have such 
a coordinate system while that in figure 4.8 does, as illustrated in figure 4.9. 
Other kinds of regions also have such a map. For instance $R^n$ itself can be
mapped onto its unit open ball by the equations

$$x^i \to \frac{2}{\pi}\,x^i\,\frac{\arctan r}{r}, \tag{4.56}$$

$$r = \left[(x^1)^2 + (x^2)^2 + \ldots + (x^n)^2\right]^{1/2}, \tag{4.57}$$

because these imply

$$r \to \frac{2}{\pi}\arctan r. \tag{4.58}$$

This is a $C^\infty$ map even at the origin, as one can see by expanding $\arctan r$ in its
Taylor series

$$\arctan r = r - \tfrac{1}{3}r^3 + \tfrac{1}{5}r^5 - \ldots.$$
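A quick numerical look at the map (4.56) may help. This is a minimal sketch; the function name `to_unit_ball` and the sample points are illustrative assumptions. It rescales each coordinate by (2/π) arctan(r)/r and treats the origin separately, where the rescaling factor tends to 2/π and the image is the origin itself.

```python
import math


def to_unit_ball(x):
    """Apply the map (4.56) to a point x of R^n given as a sequence of floats."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        # The origin is mapped to the origin.
        return tuple(0.0 for _ in x)
    scale = (2.0 / math.pi) * math.atan(r) / r
    return tuple(scale * c for c in x)


def norm(x):
    """Euclidean length of x."""
    return math.sqrt(sum(c * c for c in x))


# The image radius is (2/pi) arctan r, as in (4.58), and always stays below 1.
print(norm(to_unit_ball((3.0, 4.0))), (2.0 / math.pi) * math.atan(5.0))
print(norm(to_unit_ball((1e6, -2e6, 3e6))))
```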


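Fig. 4.9. A map from the region in figure 4.8 onto the unit open ball
of $R^2$ (the interior of the unit circle). Dotted lines map to dotted lines,
dashed to dashed, and a few typical points are shown. Clearly such a
map can be made $C^\infty$ if the boundary curve $\mathcal{C}$ is $C^\infty$.

To prove the theorem we use the coordinates $x^i$ in $U$ and construct the form
$\tilde{\beta}$ we seek. Suppose $\tilde{\alpha}$ is

$$\tilde{\alpha} = \alpha_{i\ldots k}(x^1, \ldots, x^n)\,\tilde{d}x^i \wedge \ldots \wedge \tilde{d}x^k, \tag{4.59}$$

where each component $\alpha_{i\ldots k}$ has p indices. Contract $\tilde{\alpha}$ with the ‘radial vector’
$\bar{r}$ whose components at any point on the coordinate basis are $(x^1, \ldots, x^n)$, and
call this (p − 1)-form $\tilde{\mu}$. It is, by equation (4.13),

$$\tilde{\mu} = \tilde{\alpha}(\bar{r}) = \alpha_{ij\ldots k}\,x^i\,\tilde{d}x^j \wedge \ldots \wedge \tilde{d}x^k. \tag{4.60}$$

Now we define the functions

$$\beta_{j\ldots k}(x^1, \ldots, x^n) = \int_0^1 t^{p-1}\,\alpha_{ij\ldots k}(tx^1, \ldots, tx^n)\,x^i\,dt. \tag{4.61}$$

This integral is along the radial line in the coordinate system which $\{x^i\}$ happens
to lie on. The functions define a (p − 1)-form $\tilde{\beta}$

$$\tilde{\beta} = \beta_{j\ldots k}\,\tilde{d}x^j \wedge \ldots \wedge \tilde{d}x^k,$$

and the claim is that $\tilde{\alpha} = \tilde{d}\tilde{\beta}$.
The proof of this claim is straightforward algebra. From (4.45) we have

$$(\tilde{d}\tilde{\beta})_{lj\ldots k} = p\,\frac{\partial}{\partial x^{[l}}\,\beta_{j\ldots k]}. \tag{4.62}$$

The derivative is easy:

$$\frac{\partial}{\partial x^l}\,\beta_{j\ldots k} = \int_0^1 t^{p-1}\,\alpha_{lj\ldots k}(tx^1, \ldots, tx^n)\,dt
+ \int_0^1 t^p\,x^i\,\alpha_{ij\ldots k,l}(tx^1, \ldots, tx^n)\,dt. \tag{4.63}$$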


In order to antisymmetrize this on $[lj\ldots k]$ we invoke (for the first time) the
closure of $\tilde{\alpha}$:

$$0 = (p+1)\,\alpha_{[ij\ldots k,l]} = \alpha_{i[j\ldots k,l]} - \alpha_{[j|i|\ldots k,l]} + \ldots + (-1)^{p-1}\alpha_{[j\ldots k|i|,l]} - \alpha_{[lj\ldots k],i}, \tag{4.64}$$

where vertical bars separate out an index which is not included in the antisymmetrization
implied by [ ]. But the components of $\tilde{\alpha}$ are already antisymmetric
on all indices, so the first p terms are equal, and we have

$$0 = p\,\alpha_{i[j\ldots k,l]} - \alpha_{[lj\ldots k],i}. \tag{4.65}$$
Putting this into the second integral of the antisymmetrized version of (4.63) and
inserting this into (4.62) gives

$$\begin{aligned}
(\tilde{d}\tilde{\beta})_{lj\ldots k} &= \int_0^1 \left[p\,t^{p-1}\,\alpha_{lj\ldots k}(tx^1, \ldots, tx^n) + t^p\,x^i\,\alpha_{lj\ldots k,i}(tx^1, \ldots, tx^n)\right] dt \\
&= \int_0^1 \frac{d}{dt}\left[t^p\,\alpha_{lj\ldots k}(tx^1, \ldots, tx^n)\right] dt \\
&= \alpha_{lj\ldots k}(x^1, \ldots, x^n).
\end{aligned} \tag{4.66}$$

This proves the theorem.
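The construction (4.61) can be tried out numerically in the simplest case p = 1, where it reduces to integrating the contraction of the one-form with the radial vector along the radial line. The following is a minimal sketch under stated assumptions: the function names, the sample closed one-form and the use of Simpson's rule and central differences are illustrative choices, not part of the text.

```python
def beta(alpha, x, n=200):
    """Formula (4.61) for p = 1: the integral over t in [0, 1] of
    alpha_i(t x) x^i, evaluated with Simpson's rule (n must be even)."""
    def integrand(t):
        a = alpha([t * c for c in x])
        return sum(ai * xi for ai, xi in zip(a, x))

    h = 1.0 / n
    s = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * integrand(k * h)
    return s * h / 3.0


def grad(f, x, h=1e-5):
    """Central-difference components f_{,i} of the one-form df at x."""
    out = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        out.append((f(xp) - f(xm)) / (2 * h))
    return out


# Hypothetical closed one-form with components (2xy, x^2); note a_{x,y} = a_{y,x}.
closed_form = lambda p: (2 * p[0] * p[1], p[0] ** 2)
potential = lambda p: beta(closed_form, p)

point = [0.5, -1.2]
print(potential(point))        # expected to agree with x^2 y at this point
print(grad(potential, point))  # expected to agree with the components of closed_form
print(closed_form(point))
```

For this sample form the radial integrand is 3t²x²y, so the construction yields the function x²y, whose differential reproduces the original one-form.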


Exercise 4.17 
Prove equations (4.64) and (4.65). 


Exercise 4.18 

Use the local exactness theorem to show that locally (in three- 
dimensional Euclidean vector calculus) a curl-free vector field is a 
gradient and a divergence-free vector field is a curl. 


There are two cautionary observations to be made here. The first is, as noted
in §4.18, the (p − 1)-form $\tilde{\beta}$ we constructed is not the only one for which
$\tilde{d}\tilde{\beta} = \tilde{\alpha}$. The second is that we have merely given a sufficient condition for a
closed form to be exact. Cohomology theory reveals many more complicated
manifolds on which a closed form is still exact. (See §4.24.)


4.20 Lie derivatives of forms 

We shall prove the following useful expression for the Lie derivative of
a p-form $\tilde{\omega}$ with respect to a vector field $\bar{V}$:

$$\pounds_{\bar{V}}\,\tilde{\omega} = \tilde{d}[\tilde{\omega}(\bar{V})] + (\tilde{d}\tilde{\omega})(\bar{V}). \tag{4.67}$$

That is, the p-form $\pounds_{\bar{V}}\tilde{\omega}$ is the sum of two p-forms; the first is the exterior
derivative of $\tilde{\omega}(\bar{V})$, the contraction of $\tilde{\omega}$ on $\bar{V}$; the second is the contraction
of $\tilde{d}\tilde{\omega}$ on $\bar{V}$. The proof is rather long and may be omitted on a first reading.
The result, (4.67), has a nice naturalness: $\pounds_{\bar{V}}\tilde{\omega}$ is a p-form involving $\bar{V}$ and $\tilde{\omega}$;
if it can be constructed using $\tilde{d}$ at all (which we should expect, since both
derivatives involve only the differential structure of the manifold), then it must
involve the only two p-forms which one can construct from $\bar{V}$, $\tilde{\omega}$, and $\tilde{d}$. In fact,
it is just their sum.

The proof proceeds by induction. We shall drop tildes over the symbols in the
rest of this section, for the sake of clarity.

The first part of the proof is the case where $\omega$ is a zero-form, a function $f$.
Then its contraction on $V$ is by definition zero, while its exterior derivative is $df$.
If $V = d/d\lambda$, then we know that $df(V) = df/d\lambda$, but this is also equal to $\pounds_V f
= V(f) = df/d\lambda$. This proves the expression in the simplest case.

The next case is $\omega$, a one-form. Then we use component notation:

$$\begin{aligned}
\omega(V) = \omega_i V^i \;&\Rightarrow\; d[\omega(V)] = (\omega_i V^i)_{,j}\,dx^j, \\
d\omega = d(\omega_i\,dx^i) &= (d\omega_i) \wedge dx^i \\
&= \omega_{i,j}\,dx^j \wedge dx^i = \omega_{i,j}\,(dx^j \otimes dx^i - dx^i \otimes dx^j) \\
\Rightarrow\; (d\omega)(V) &= \omega_{i,j}\,[dx^j(V)\,dx^i - dx^i(V)\,dx^j] \\
&= \omega_{i,j}V^j\,dx^i - \omega_{i,j}V^i\,dx^j.
\end{aligned}$$

These expressions combine to give

$$d[\omega(V)] + d\omega(V) = [\omega_i V^i{}_{,j} + \omega_{j,i}V^i]\,dx^j,$$

which is the same as $\pounds_V\omega$ from equation (3.14).


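The rest of the proof proceeds by induction. Since a general p-form can be
represented as a sum of functions times wedge products of p one-forms, as

$$\omega = \frac{1}{p!}\,\omega_{i\ldots j}\,dx^i \wedge \ldots \wedge dx^j,$$

it suffices to prove the theorem for a form which can be written as

$$\omega = f\,a \wedge b, \tag{4.68}$$

where we assume the theorem has been established for $a$ and $b$. Then we have

$$\begin{aligned}
\pounds_V\omega &= (\pounds_V f)\,a \wedge b + f\,(\pounds_V a) \wedge b + f\,a \wedge (\pounds_V b) \\
&= df(V)\,a \wedge b + f\,\{d[a(V)] + (da)(V)\} \wedge b \\
&\quad + f\,a \wedge \{d[b(V)] + (db)(V)\}.
\end{aligned}$$

But we also know that (if $a$ is a p-form)

$$\begin{aligned}
d[\omega(V)] &= d[f\,a(V) \wedge b + (-1)^p f\,a \wedge b(V)] \\
&= df \wedge [(a \wedge b)(V)] + f\,\{d[a(V)] \wedge b + (-1)^{p-1}a(V) \wedge db \\
&\quad + (-1)^p\,da \wedge b(V) + a \wedge d[b(V)]\}, \\
(d\omega)(V) &= [df \wedge a \wedge b + f\,da \wedge b + (-1)^p f\,a \wedge db](V) \\
&= df(V)\,a \wedge b - df \wedge [(a \wedge b)(V)] + f\,[da(V) \wedge b \\
&\quad + (-1)^{p+1}da \wedge b(V) + (-1)^p a(V) \wedge db + a \wedge db(V)].
\end{aligned}$$

Thus, adding these gives the same expression as for $\pounds_V\omega$ above. This establishes
the expression’s validity for general forms.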
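A small worked example may make (4.67) concrete. The one-form and vector field below are chosen purely for illustration (they do not appear in the text), and tildes and bars are omitted as in the rest of this section. On $R^2$ take $\omega = x\,dy$ and $V = y\,\partial/\partial x$.

```latex
\begin{align*}
\omega(V) &= \omega_i V^i = \omega_x V^x + \omega_y V^y = 0
  \quad\Rightarrow\quad d[\omega(V)] = 0, \\
d\omega &= dx \wedge dy
  \quad\Rightarrow\quad (d\omega)(V) = dx(V)\,dy - dy(V)\,dx = y\,dy, \\
d[\omega(V)] + (d\omega)(V) &= y\,dy .
\end{align*}
% Direct check from the component form of the Lie derivative of a one-form,
% (\pounds_V \omega)_j = \omega_{j,i} V^i + \omega_i V^i{}_{,j}:
\begin{align*}
(\pounds_V\omega)_x &= \omega_{x,x}V^x + \omega_x V^x{}_{,x} + \omega_y V^y{}_{,x} = 0, \\
(\pounds_V\omega)_y &= \omega_{y,x}V^x + \omega_x V^x{}_{,y} + \omega_y V^y{}_{,y} = y,
\end{align*}
% so \pounds_V \omega = y\,dy, in agreement with the sum above.
```

Both routes give the same one-form, as the theorem requires.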


4.21 Lie derivatives and exterior derivatives commute 
A very important consequence of (4.67) is the fact that Lie and
exterior differentiation commute: (4.69) below. To prove this, note that for any
form $\omega$ (again omitting tildes for clarity),

$$\pounds_V\,d\omega = d[(d\omega)(V)],$$

since $dd\omega = 0$. But, by using the Lie derivative formula once more we have

$$(d\omega)(V) = \pounds_V\omega - d[\omega(V)].$$

So that (again because $dd = 0$) we get

$$\pounds_V(d\omega) = d(\pounds_V\omega). \tag{4.69}$$

Lie differentiation and exterior differentiation commute! This is actually a
special case of a more fundamental property of $d$ which is established in more
complete treatments of the subject, namely that there is a sense in which $d$
commutes with any differentiable mapping of the manifold. (This commutation
property may make it easier for you to prove the second-to-last part of exercise
3.9!)


4.22 Stokes’ theorem 

We are now in a position to show that exterior differentiation and
integration are inverse to one another. Since integration of forms on an n-
dimensional manifold is defined only for n-forms, the inverse property applies
only to exterior derivatives of (n − 1)-forms. Moreover, since only definite
integration of n-forms is defined (i.e. the integral produces a number, not another
form), the inverse relation analogous to equation (4.43) at the beginning of part
B of this chapter will have to relate the integral of the n-form $\tilde{d}\tilde{\omega}$ to another
integral, that of $\tilde{\omega}$. But the (n − 1)-form $\tilde{\omega}$ can be integrated only over (n − 1)-
dimensional hypersurfaces, so we are led naturally to look for a theorem relating
the integral of $\tilde{d}\tilde{\omega}$ over a finite region to the integral of $\tilde{\omega}$ over the region’s
boundary, which is (n − 1)-dimensional. Our approach to this theorem (equation
(4.75) below) will, however, be somewhat indirect, in order to avoid some of
the lengthy calculations the usual proofs employ. We shall begin by looking at
the change (to first order) in the value of an integral when the region of
integration is slightly changed.

Accordingly, let us consider the integral of an n-form $\tilde{\omega}$ over a region $U$ of
an n-dimensional manifold $M$. Let $U$ have a smooth orientable boundary called
$\partial U$, by which we mean an orientable submanifold of $M$ of dimension $n - 1$
which divides $M - \partial U$ into disjoint sets $U$ and $CU$ (the complement of $U$) in such
a way that any continuous curve joining a point of $U$ to a point of $CU$ must
contain a point of $\partial U$. For simplicity we will assume $\partial U$ is connected, although
this is not necessary. Examples are given in figure 4.10. Now let $\bar{\xi}$ be any vector
field on $M$, and consider a change in the region of integration generated by Lie
dragging the region (but not the form $\tilde{\omega}$) along $\bar{\xi}$. Thus there is a family of
regions $U(\varepsilon)$ and boundaries $\partial U(\varepsilon)$ obtained by moving along $\bar{\xi}$ a parameter
distance $\varepsilon$ from the original ones, $U = U(0)$ and $\partial U = \partial U(0)$. This is illustrated
in figure 4.11.

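The change in the integral of $\tilde{\omega}$ is simply the integral over $\delta U(\varepsilon)$, the region
between the boundaries:

$$\int_{U(\varepsilon)} \tilde{\omega} - \int_{U(0)} \tilde{\omega} = \int_{\delta U(\varepsilon)} \tilde{\omega}. \tag{4.70}$$

We will calculate this. Let $V$ be a patch of $\partial U$ covered by coordinates we shall call
$\{x^2, x^3, \ldots, x^n\}$. By Lie dragging along $\bar{\xi}$, we construct coordinates $\{x^1 = \varepsilon,
x^2, x^3, \ldots, x^n\}$ for a neighborhood in $M$ of any such patch $V$ in which $\bar{\xi}$ is
not tangent to $\partial U$ (see figure 4.12). This defines and provides a coordinate
system for the region $\delta V(\varepsilon)$ between $\partial U(0)$ and $\partial U(\varepsilon)$ ‘above’ $V = V(0)$. We
will first calculate the integral of $\tilde{\omega}$ over this region and then extend it to all of
$\delta U(\varepsilon)$.

In our coordinates we write

$$\tilde{\omega} = f(x^1, \ldots, x^n)\,\tilde{d}x^1 \wedge \ldots \wedge \tilde{d}x^n.$$

If $\varepsilon$ is small, its integral is†

$$\begin{aligned}
\int_{\delta V(\varepsilon)} \tilde{\omega} &= \int_{V(0)} \left[\int_0^{\varepsilon} f(x^1, \ldots, x^n)\,dx^1\right] dx^2 \ldots dx^n \\
&= \varepsilon \int_{V(0)} f(0, x^2, \ldots, x^n)\,dx^2 \ldots dx^n + o(\varepsilon) \\
&= \varepsilon \int_{V} \tilde{\omega}(\bar{\xi})\big|_{\partial U} + o(\varepsilon).
\end{aligned} \tag{4.71}$$

† The symbol $o(\varepsilon)$ stands for any function $g(\varepsilon)$ for which $g(\varepsilon)/\varepsilon \to 0$ as $\varepsilon \to 0$.

The last line follows from (4.13) and the fact that $\partial/\partial x^1 = \bar{\xi}$.

Fig. 4.10. A manifold with a ‘handle’. Curves $\mathcal{C}_1$ and $\mathcal{C}_2$ are not
boundaries since they do not divide $M$ into an ‘inside’ and ‘outside’.
The union, $\mathcal{C}_1 \cup \mathcal{C}_2$, is a boundary consisting of disconnected submanifolds.
By contrast, $\mathcal{C}_3$ is a connected boundary.

Fig. 4.11. The deformation of $U = U(0)$ into $U(\varepsilon)$ by displacement a
parameter distance $\varepsilon$ along the integral curves of $\bar{\xi}$. Arrows represent
the vector $\varepsilon\bar{\xi}$ (for small $\varepsilon$). The region between $\partial U$ and $\partial U(\varepsilon)$ is $\delta U(\varepsilon)$.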

Equation (4.71) is independent of the coordinates we constructed, but it does
require that $\bar{\xi}$ should not be tangent to $\partial U$ in $V$. So it obviously applies to any
region of $\partial U$ bounded by points where $\bar{\xi}$ is tangent to $\partial U$. If these points form
submanifolds of $\partial U$ of lower dimensionality (as in figure 4.11), then they will
not cause a problem. They will just divide $\partial U$ into different regions $V_{(i)}$ in
each of which (4.71) holds. If on the other hand $\bar{\xi}$ is tangent to $\partial U$ in an open
region in $\partial U$ then the Lie dragging simply maps that region into itself and does
not change the integral of $\tilde{\omega}$ at all, so (4.71) still holds, both sides being zero.
We can therefore apply (4.71) over all of $\partial U$ and combine it with (4.70) to get

$$\frac{d}{d\varepsilon}\int_{U(\varepsilon)} \tilde{\omega} = \lim_{\varepsilon \to 0}\frac{1}{\varepsilon}\int_{\delta U(\varepsilon)} \tilde{\omega} = \int_{\partial U} \tilde{\omega}(\bar{\xi})\big|_{\partial U}. \tag{4.72}$$

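Now, we can obtain another expression for $(d/d\varepsilon)\int \tilde{\omega}$ from the very construction
of the Lie dragging of the region along $\bar{\xi}$; at any new point the integrand
differs from the old one by $\varepsilon\,\pounds_{\bar{\xi}}\tilde{\omega} + o(\varepsilon)$. We therefore have

$$\frac{d}{d\varepsilon}\int_{U(\varepsilon)} \tilde{\omega} = \int_{U(\varepsilon)} \pounds_{\bar{\xi}}\,\tilde{\omega}. \tag{4.73}$$

But the expression (4.67) for $\pounds_{\bar{\xi}}\tilde{\omega}$ is particularly simple in this case, since $\tilde{d}\tilde{\omega}$ is
an (n + 1)-form and so vanishes identically:

$$\int_U \pounds_{\bar{\xi}}\,\tilde{\omega} = \int_U \tilde{d}[\tilde{\omega}(\bar{\xi})].$$

Combining this with our previous expression for $(d/d\varepsilon)\int_U \tilde{\omega}$, we get the divergence
theorem (the reason for whose name will become clear in the next section):

$$\int_U \tilde{d}[\tilde{\omega}(\bar{\xi})] = \int_{\partial U} \tilde{\omega}(\bar{\xi})\big|_{\partial U}. \tag{4.74}$$

Now, since $\tilde{\omega}$ and $\bar{\xi}$ are arbitrary, $\tilde{\omega}(\bar{\xi})$ is an arbitrary (n − 1)-form. So we can
rewrite this as Stokes’ theorem for an arbitrary (n − 1)-form $\tilde{\alpha}$ defined on $M$:

$$\int_U \tilde{d}\tilde{\alpha} = \int_{\partial U} \tilde{\alpha}, \tag{4.75}$$

where on the right-hand side we must of course restrict $\tilde{\alpha}$ to $\partial U$.

Fig. 4.12. A coordinate system for the neighborhood of a patch $V$ of
$\partial U$ in which $\bar{\xi}$ is never tangent to $\partial U$. The region $\delta V(\varepsilon)$ is all the points
on those integral curves of $\bar{\xi}$ that pass through $V$, and which are a parameter
distance $\le \varepsilon$ from $V$.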

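That this is what one knows as Stokes’ theorem in Euclidean vector calculus
is easily seen by letting $M$ be two-dimensional as in figure 4.11. Then let $\tilde{\alpha}$ be a
one-form, $\tilde{\alpha} = a_i\,\tilde{d}x^i$, $\tilde{d}\tilde{\alpha} = (a_{i,j} - a_{j,i})\,\tilde{d}x^j \otimes \tilde{d}x^i$. The restriction of $\tilde{\alpha}$ to $\partial U$
means allowing it to operate only on $\bar{l} = d/d\lambda$, a vector tangent to the curve $\partial U$.
Then we get

$$\int_U (a_{1,2} - a_{2,1})\,\tilde{d}x^2 \wedge \tilde{d}x^1 = \oint_{\partial U} a_i\,\frac{dx^i}{d\lambda}\,d\lambda = \oint_{\partial U} a_i\,dx^i.$$

This is the usual Stokes’ theorem.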
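The two-dimensional statement can be checked numerically on a simple region. The sketch below makes several assumptions that are not in the text: the region is the unit square, the one-form has the illustrative components a = (-y³, xy), the boundary is traversed counter-clockwise, and both integrals are approximated with midpoint rules. Written with the standard orientation, the surface integrand is a_{2,1} - a_{1,2}, which equals the coefficient of the two-form in the text when the factors of the wedge product are put in the order 1, 2.

```python
def a(x, y):
    """Components (a_1, a_2) of a hypothetical sample one-form."""
    return (-y ** 3, x * y)


def curl_component(x, y, h=1e-5):
    """a_{2,1} - a_{1,2}, estimated by central differences."""
    d1a2 = (a(x + h, y)[1] - a(x - h, y)[1]) / (2 * h)
    d2a1 = (a(x, y + h)[0] - a(x, y - h)[0]) / (2 * h)
    return d1a2 - d2a1


def area_integral(n=200):
    """Midpoint-rule integral of a_{2,1} - a_{1,2} over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += curl_component((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h


def boundary_integral(n=2000):
    """Midpoint-rule integral of a_i dx^i counter-clockwise round the square."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += a(s, 0.0)[0] * h            # bottom edge, x increasing
        total += a(1.0, s)[1] * h            # right edge, y increasing
        total += a(1.0 - s, 1.0)[0] * (-h)   # top edge, x decreasing
        total += a(0.0, 1.0 - s)[1] * (-h)   # left edge, y decreasing
    return total


print(area_integral(), boundary_integral())  # both close to 3/2 for this sample
```

For this sample the integrand is y + 3y², whose integral over the square is 3/2, and only the right and top edges contribute to the boundary integral.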


4.23 Gauss’ theorem and the definition of divergence 

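Stokes’ theorem also embodies what is usually known as Gauss’
theorem in vector calculus. For example, return to (4.74) and consider
coordinates in which $\tilde{\omega} = \tilde{d}x^1 \wedge \ldots \wedge \tilde{d}x^n$, in some region $W$ of $M$. Then its
contraction with $\bar{\xi}$ is

$$\tilde{\omega}(\bar{\xi}) = \xi^1\,\tilde{d}x^2 \wedge \ldots \wedge \tilde{d}x^n - \xi^2\,\tilde{d}x^1 \wedge \tilde{d}x^3 \wedge \ldots \wedge \tilde{d}x^n + \ldots, \tag{4.76}$$

$$\begin{aligned}
\tilde{d}[\tilde{\omega}(\bar{\xi})] &= \xi^1{}_{,1}\,\tilde{d}x^1 \wedge \tilde{d}x^2 \wedge \ldots \wedge \tilde{d}x^n \\
&\quad + \xi^2{}_{,2}\,\tilde{d}x^1 \wedge \tilde{d}x^2 \wedge \tilde{d}x^3 \wedge \ldots \wedge \tilde{d}x^n + \ldots \\
&= \xi^i{}_{,i}\,\tilde{\omega}.
\end{aligned}$$

By analogy with Euclidean geometry we define the ‘$\tilde{\omega}$-divergence’ of a vector
field $\bar{\xi}$:

$$(\operatorname{div}_{\tilde{\omega}}\bar{\xi})\,\tilde{\omega} = \tilde{d}[\tilde{\omega}(\bar{\xi})]. \tag{4.77}$$

If in the patch $V$ of $\partial U$ defined in §4.22 we again use coordinates such that $\partial U$
is a surface of constant $x^1$, then the restriction of $\tilde{\omega}(\bar{\xi})$ to $\partial U$ is again

$$\tilde{\omega}(\bar{\xi})\big|_{\partial U} = \xi^1\,\tilde{d}x^2 \wedge \ldots \wedge \tilde{d}x^n = \tilde{d}x^1(\bar{\xi})\,\tilde{d}x^2 \wedge \ldots \wedge \tilde{d}x^n.$$

More generally, if $\tilde{n}$ is a one-form normal to $\partial U$ (i.e. $\tilde{n}(\bar{\eta}) = 0$ on any vector $\bar{\eta}$
tangent to $\partial U$), and if $\tilde{\alpha}$ is any (n − 1)-form such that

$$\tilde{\omega} = \tilde{n} \wedge \tilde{\alpha},$$

then we get $\tilde{\omega}(\bar{\xi})\big|_{\partial U} = \tilde{n}(\bar{\xi})\,\tilde{\alpha}\big|_{\partial U}$. This gives (4.74) in the form

$$\int_U (\operatorname{div}_{\tilde{\omega}}\bar{\xi})\,\tilde{\omega} = \int_{\partial U} \tilde{n}(\bar{\xi})\,\tilde{\alpha}, \tag{4.78}$$

where $\tilde{\alpha}$ is restricted to $\partial U$ and $\tilde{n} \wedge \tilde{\alpha} = \tilde{\omega}$. If the coordinate system for (4.76)
covers all of $U$, then this is

$$\int_U \xi^i{}_{,i}\,d^n x = \oint_{\partial U} \xi^i n_i\,d^{n-1}x, \tag{4.79}$$

which is the usual version of Gauss’ theorem in $R^n$.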


Exercise 4.19 

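Show that, although $\tilde{\alpha}$ as defined in (4.78) is not unique, the restriction
$\tilde{\alpha}\big|_{\partial U}$ is unique once $\tilde{n}$ is given. Show that $\tilde{n}$ is fixed up to the scale
transformation $\tilde{n} \to f\tilde{n}$, where $f$ is any nowhere-zero function, and so
show that $\tilde{\alpha}\big|_{\partial U}$ is unique up to $\tilde{\alpha}\big|_{\partial U} \to f^{-1}\,\tilde{\alpha}\big|_{\partial U}$. In this way conclude
that $\tilde{n}(\bar{\xi})\,\tilde{\alpha}\big|_{\partial U}$ is unique.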


The arbitrariness of $\tilde{\omega}$ in the definition we have used for the divergence of $\bar{\xi}$
can be eliminated if there is a metric, by using the metric volume element
(§4.13). This is ambiguous up to a sign but (4.77) shows that $\operatorname{div}_{\tilde{\omega}}\bar{\xi}$ is in fact
independent of this sign. Equation (4.79) shows that the usual divergence in $R^n$
uses the form $\tilde{d}x^1 \wedge \ldots \wedge \tilde{d}x^n$, which is the metric volume element of the
Euclidean metric.


Exercise 4.20 
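From (4.77) show that, if coordinates are chosen in which $\tilde{\omega}
= f\,\tilde{d}x^1 \wedge \ldots \wedge \tilde{d}x^n$, then

$$\operatorname{div}_{\tilde{\omega}}\bar{\xi} = \frac{1}{f}\,(f\xi^i)_{,i}. \tag{4.80}$$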


Exercise 4.21 

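In Euclidean three-space the preferred volume three-form is $\tilde{\omega} = \tilde{d}x
\wedge \tilde{d}y \wedge \tilde{d}z$. Show that in spherical polar coordinates this is $\tilde{\omega} = r^2 \sin\theta\,
\tilde{d}r \wedge \tilde{d}\theta \wedge \tilde{d}\phi$. Use (4.80) to show that the divergence of a vector $\bar{\xi}
= \xi^r\,\partial/\partial r + \xi^\theta\,\partial/\partial\theta + \xi^\phi\,\partial/\partial\phi$ is

$$\operatorname{div}\bar{\xi} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2\xi^r) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,\xi^\theta) + \frac{\partial\xi^\phi}{\partial\phi}.$$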
Exercise 4.22

In fluid dynamics (and in many other branches of physics) one deals
with the equation of continuity, which is written in ordinary tensor-
calculus form as

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\bar{V}) = 0.$$

Here $\rho$ is the density of mass (or other conserved quantity) and $\bar{V}$ is its
rate of flow. Defining $\tilde{\omega} = \tilde{d}x \wedge \tilde{d}y \wedge \tilde{d}z$ as in exercise 4.21 above, and
using the comoving time-derivative operator $(\partial/\partial t + \pounds_{\bar{V}})$ (which we will
discuss in detail in chapter 5), show that the equation of continuity is

$$\left(\frac{\partial}{\partial t} + \pounds_{\bar{V}}\right)(\rho\tilde{\omega}) = 0.$$

This permits one to regard $\rho\tilde{\omega}$ as a dynamically conserved volume
three-form on the fluid. The ‘volume’ it assigns to any fluid element is
that element’s mass.


Exercise 4.23 

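(a) Show from (4.77) that another expression for the divergence of a
vector $\bar{\xi}$ is

$$\operatorname{div}_{\tilde{\omega}}\bar{\xi} = {*}\tilde{d}\,{*}\bar{\xi}, \tag{4.81}$$

where the $*$-operation is the dual with respect to $\tilde{\omega}$ introduced earlier.

(b) For any p-vector $F$ define

$$\operatorname{div}_{\tilde{\omega}}F = (-1)^{n(p-1)}\,{*}\tilde{d}\,{*}F. \tag{4.82}$$

Show that $\operatorname{div}_{\tilde{\omega}}F$ is a (p − 1)-vector. Show that if $\tilde{\omega}$ has components
$\epsilon_{i\ldots j}$ in some coordinate system, then

$$(\operatorname{div}_{\tilde{\omega}}F)^{i\ldots j} = F^{ki\ldots j}{}_{,k} \tag{4.83}$$

in those coordinates.

(c) Generalize (4.80) to p-vectors.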


Exercise 4.24 
(a) On the sphere $S^2$ use Stokes’ theorem to prove that a two-form $\tilde{\omega}$ is
exact (is the exterior derivative of another form) only if

$$\int_{S^2} \tilde{\omega} = 0.$$

(Hint: $S^2$ has no boundary.)

(b) Show that the two-form

$$\tilde{\omega} = x^1\,\tilde{d}x^2 \wedge \tilde{d}x^3$$

defined on $R^3$ has the following value when integrated over the unit
sphere $S^2$ as a submanifold of $R^3$:

$$\int_{S^2} \tilde{\omega} = \tfrac{4}{3}\pi.$$

(Hint: what is $\tilde{d}\tilde{\omega}$ in $R^3$?) Since any two-form on $S^2$ is closed (why?),
this proves that not every closed two-form on $S^2$ is exact.

(c) Show that every closed one-form $\tilde{\beta}$ on $S^2$ is exact. (Hint: integrate
$\tilde{d}\tilde{\beta}$ over a part of $S^2$.)





4.24 A glance at cohomology theory 

Exercise 4.24 above illustrates how Stokes’ theorem may be used to
study those global properties of a manifold which determine the relation
between closed and exact forms. Let $Z^p(M)$ be the set of all closed p-forms on
$M$ (all $\tilde{\alpha}$ such that $\tilde{d}\tilde{\alpha} = 0$) and let $B^p(M)$ be the set of all exact p-forms on $M$ (all
$\tilde{\alpha}$ such that $\tilde{\alpha} = \tilde{d}\tilde{\beta}$). Both sets are vector spaces over the real numbers: for
example, if $\tilde{\alpha}$ and $\tilde{\beta}$ are closed p-forms then $a\tilde{\alpha} + b\tilde{\beta}$ is also closed for any real
numbers $a$ and $b$. In fact, $B^p$ is a subspace of $Z^p$, since $\tilde{d}\tilde{d}\tilde{\beta} = 0$. We now show
how $Z^p(M)$ can be split up into equivalence classes modulo the addition of
elements of $B^p(M)$. Closed forms $\tilde{\alpha}_1$ and $\tilde{\alpha}_2$ are said to be equivalent ($\tilde{\alpha}_1 \approx \tilde{\alpha}_2$)
if their difference is an element of $B^p(M)$:

$$\tilde{\alpha}_1 \approx \tilde{\alpha}_2 \;\Leftrightarrow\; \tilde{\alpha}_1 - \tilde{\alpha}_2 = \tilde{d}\tilde{\beta}. \tag{4.84}$$

The equivalence class of $\tilde{\alpha}_1$ is the set of all closed forms equivalent to it. The set
of all equivalence classes is called the pth de Rham cohomology vector space of
$M$, $H^p(M)$.


Exercise 4.25 

(a) A relation $\approx$ is called an equivalence relation if it has the following
properties: (i) for any $\tilde{\alpha}$, $\tilde{\alpha} \approx \tilde{\alpha}$; (ii) if $\tilde{\alpha} \approx \tilde{\beta}$ then $\tilde{\beta} \approx \tilde{\alpha}$; and (iii) if
$\tilde{\alpha} \approx \tilde{\beta}$ and $\tilde{\beta} \approx \tilde{\gamma}$ then $\tilde{\alpha} \approx \tilde{\gamma}$. Show that (4.84) does define an equivalence
relation.

(b) If $Z^p$ and $B^p$ are, respectively, any vector space and a subspace of it, then
the set of equivalence classes we have defined is called the quotient
space of $Z^p$ by $B^p$, denoted by $Z^p/B^p$. Show that this is a vector space.
(You must define addition of equivalence classes. Prove and then use
the following result: if $\tilde{\alpha}_1$ and $\tilde{\alpha}_2$ are in equivalence classes $A_1$ and $A_2$
respectively, then the sum of any element of $A_1$ and any element of $A_2$ is
in the equivalence class of $\tilde{\alpha}_1 + \tilde{\alpha}_2$.)

(c) Consider the vector space $R^2$ and its subspace $R^1_x$ consisting of all
vectors of the form $(a, 0)$ for arbitrary real $a$. Show that $R^2/R^1_x$ is the
congruence of straight lines parallel to the x-axis.


We can translate the result of §4.19 into the statement that for the open ball
in n dimensions or any region $U$ diffeomorphic to it, $H^p(U) = 0$ for $p \ge 1$, since
all closed p-forms are equivalent to one another and hence to the zero p-form.
It is also easy to compute $H^0(U)$, or in fact $H^0(M)$ for any connected manifold
$M$. A zero-form is just a function, so $Z^0(M)$ is the space of functions $f$ for which
$\tilde{d}f = 0$, i.e. the constant functions. This is simply $R^1$. Moreover, since there is
no such thing as a (−1)-form, the space $B^0(M)$ is just the zero-function. The
equivalence relation $\approx$ is therefore just the usual algebraic equality: constants $f$
and $g$ are equivalent ($f \approx g$) if and only if they are equal ($f = g$). Therefore
$H^0(M) = Z^0(M) = R^1$. If $M$ is not a connected manifold then a function in
$Z^0(M)$ need be constant only on each connected component of $M$, but may have
different values on different components. Then $H^0(M) = Z^0(M) = R^m$, where
$m$ is the number of components of $M$.

Exercise 4.24 clearly can be generalized to any number of dimensions and
shows that $H^n(S^n) \ne 0$ (part (b)) and $H^{n-1}(S^n) = 0$ (part (c)). These are special
cases of

$$\begin{aligned}
H^n(S^n) &= R^1, \\
H^p(S^n) &= 0, \quad 0 < p < n, \\
H^0(S^n) &= R^1.
\end{aligned} \tag{4.85}$$

The proof of this plus many other interesting results can be found in Spivak
(1970, volume 1). Among the many applications of cohomology theory in
Spivak is the fixed-point theorem: for even $n$ the sphere $S^n$ does not possess a
nowhere-zero vector field.


Exercise 4.26 

For odd $n$, find a nowhere-zero vector field on $S^n$. (Hint: regard $S^{2m+1}$
as a submanifold of $R^{2m+2}$ and consider the effect on $S^{2m+1}$ of the
rotation corresponding to the following matrix of $SO(2m + 2)$: $T
= \operatorname{diag}(A_1, A_2, \ldots, A_{m+1})$, where each $A_i$ is the $2 \times 2$ matrix

$$A_i = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

independently of $i$. Show that the vector field $d/d\theta$ on $S^{2m+1}$ which is
tangent to the congruence generated by this one-parameter subgroup of
mappings does not vanish anywhere.)
Exercise 4.27

(a) Generalize exercise 4.24(b) to show that the (n − 1)-form defined on
$R^n$

$$\tilde{\omega} = \epsilon_{ij\ldots k}\,x^i\,\tilde{d}x^j \wedge \ldots \wedge \tilde{d}x^k \tag{4.86}$$

is nowhere zero when restricted to the sphere $S^{n-1}$ defined by $(x^1)^2
+ (x^2)^2 + \ldots + (x^n)^2 = 1$.

(b) Show that $H^{n-1}(S^{n-1}) = R^1$ implies that if $\tilde{\alpha}$ is any (n − 1)-form on
$S^{n-1}$ then $\tilde{\alpha} - a\tilde{\omega}$ is exact, where $a = \int_{S^{n-1}} \tilde{\alpha} \big/ \int_{S^{n-1}} \tilde{\omega}$.

(c) By taking the dual of this relation show that if $f$ is any function on
$S^{n-1}$ it can always be represented in the form $f = c + \operatorname{div}_{\tilde{\omega}}\bar{V}$ for some
constant $c$ and vector field $\bar{V}$ on $S^{n-1}$.

(d) For the circle $S^1$ prove that $H^1(S^1) = R^1$ by constructing a function $f$
for which $\tilde{d}f = \tilde{\alpha} - a\tilde{\omega}$, as in (b) above.


Exercise 4.28 

(a) Suppose a one-form $\tilde{\alpha}$ on $M$ has the property $\oint_{\mathcal{C}} \tilde{\alpha} = 0$ for any closed
curve $\mathcal{C}$ in $M$. Show that $\tilde{\alpha}$ is exact, i.e. that there is a function $f$ such
that $\tilde{\alpha} = \tilde{d}f$.

(b) A connected manifold $M$ is simply connected if every closed curve can
be smoothly contracted to a single point. Show that $M$ is simply connected
if and only if $H^1(M) = 0$.


Before leaving the subject of cohomology we must make two short remarks.
First, the dimension of $H^p(M)$ is called the pth Betti number $b^p$ of $M$. Second,
although our definition of $H^p(M)$ relied on the differential structure of the
manifold, it is one of the most fundamental theorems of cohomology (the de
Rham theorem) that the cohomology groups depend only on the topological
structure of $M$ and not its differentiability. See Warner (1971) for further
discussion.


4.25 Differential forms and differential equations 
The example mentioned in §4.17 of the way in which exterior differ- 
entiation has a natural relation to integrability conditions also illustrates that, 
at least for first-order partial differential equations, there is a natural way to 
write the equations as relations among forms. This is so important that we shall 
expand upon it here. 
Consider the equation

$$\frac{dy}{dx} = f(x, y),$$

rewritten as

$$dy = f(x, y)\,dx. \tag{4.87}$$

On a two-dimensional manifold $M$ whose coordinates are $x$ and $y$, we are
tempted to write the one-form equation

$$\tilde{d}y - f\,\tilde{d}x = 0, \tag{4.88}$$

where $f$ is now a function on $M$. What meaning does this equation have? Surely
on such a manifold the one-forms $\tilde{d}y$ and $\tilde{d}x$ are linearly independent, so (4.88)
cannot really be true: it is not an identity. But we do not expect it to be an
identity. It comes from (4.87), which is a relation between ‘increments’ $dy$ and
$dx$ for solutions only. A solution of (4.87) is a relation of the form $y = g(x)$,
which defines a curve (or a path, at any rate) in $M$: a one-dimensional submanifold
of $M$. Vectors tangent to this submanifold have slope $dy/dx$ equal to $f(x, y)$.
Consider one such vector $\bar{V}$ at some point $P$, with components $(1, f(P))$. For
such a vector, $\tilde{d}y(\bar{V}) = f(P)$ and $\tilde{d}x(\bar{V}) = 1$. Therefore the one-form in (4.88) is
zero on $\bar{V}$:

$$(\tilde{d}y - f\,\tilde{d}x)(\bar{V}) = 0.$$

This is the meaning of (4.88): solutions to the original differential equation
define submanifolds of $M$ whose tangent vectors annul the form (4.88). Equation
(4.88) is true when restricted to this submanifold. Conversely, if there
exist submanifolds whose tangent vectors annul (4.88), then these submanifolds
are solutions of (4.87). Naturally there is not just one solution submanifold but
a whole family of them, distinguished from one another by, say, the ‘initial
value’ of the solution $y$ at some fixed $x = x_0$ (or equivalently by the arbitrary
constant of integration in the solution to (4.87)).

One can of course generalize this picture. Any given set of forms (not necessarily
one-forms) $\{\tilde{\gamma}_i, i = 1, \ldots, N\}$ defines at any point $P$ a subspace of $T_P$
which annuls them. A solution to the forms (or to their associated differential
equations) is the submanifold formed by the meshing together of these little
tangent subspaces. The question of whether this meshing together is possible
is clearly related to the theorem of Frobenius, proved in chapter 3. We reformulate
this theorem in the language of forms in the next section.

The first question for the physicist, however, is usually to find the set (or a 
set) of forms which is equivalent to a given set of differential equations. An 
example of this is given in exercise 4.32, where the equations are first-order. 

A more complicated example is provided by the second-order harmonic oscillator
equation

$$ \frac{d^2 x}{dt^2} + \omega^2 x = 0, \tag{4.89} $$

where $\omega$ is, for convenience, taken to be a constant. To put this in the language
of forms, we write it as two first-order equations:

$$ \frac{dx}{dt} = y, \qquad \frac{dy}{dt} = -\omega^2 x. $$

Then it is clear that finding a submanifold that annuls the forms

$$ \tilde\alpha = \tilde{d}x - y\,\tilde{d}t, \qquad \tilde\beta = \tilde{d}y + \omega^2 x\,\tilde{d}t, $$

is equivalent to solving (4.89). The whole manifold is three-dimensional, with
coordinates $(x, y, t)$. A solution submanifold is one-dimensional, since annulling
$\tilde\alpha$ and $\tilde\beta$ amounts to two restrictions on the vectors at any point of the manifold.
Further instructive examples may be found in the papers by Estabrook (1976) 
and by Harrison & Estabrook (1971) in the bibliography. We now turn to the 
problem of the existence of solutions to these equations. 
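
As a quick numerical illustration (a minimal sketch, not part of the original text), the snippet below evaluates the two oscillator one-forms, dx - y dt and dy + omega^2 x dt, on the tangent vector of the familiar solution x = A cos(omega t + phase), y = dx/dt. The function names `tangent` and `forms_on_tangent` and the sample values of the amplitude, frequency and phase are assumptions chosen only for this example.

```python
import math

def tangent(t, amp=1.5, omega=2.0, phase=0.3):
    """Point (x, y, t) on the curve x = amp*cos(omega*t + phase), y = dx/dt,
    together with its tangent vector (dx/dt, dy/dt, dt/dt)."""
    x = amp * math.cos(omega * t + phase)
    y = -amp * omega * math.sin(omega * t + phase)
    dy_dt = -amp * omega**2 * math.cos(omega * t + phase)
    return (x, y, t), (y, dy_dt, 1.0)

def forms_on_tangent(t, omega=2.0):
    """Values of alpha = dx - y dt and beta = dy + omega^2 x dt on the tangent."""
    (x, y, _), (vx, vy, vt) = tangent(t, omega=omega)
    alpha = vx - y * vt
    beta = vy + omega**2 * x * vt
    return alpha, beta

for t in (0.0, 0.7, 2.5):
    print(t, forms_on_tangent(t))
```

Both values vanish (up to rounding) at every sampled point, which is what it means for the curve to be a solution submanifold.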


4.26 Frobenius’ theorem (differential forms version) 

We now return to one of the most important theorems of differential 
calculus on manifolds, whose Lie derivative version we gave in §3.7. In order to 
re-cast it in terms of differential forms we first need some definitions. A set of
forms $\{\tilde\beta_i\}$ of any degree defines at each point $P$ a subspace of vectors $X_P$ of $T_P$,
each of which annuls each $\tilde\beta_i$. This is called the annihilator of the set of forms at
$P$. The complete ideal of the set at $P$ is all the forms at $P$ whose restriction to
$X_P$ vanishes. (Notice that if $\tilde\gamma$ is any form at $P$, $\tilde\gamma \wedge \tilde\beta_i$ is zero when restricted to
the annihilator of $\tilde\beta_i$, and so is in the complete ideal.) Any such complete ideal
has a set of linearly independent one-forms $\{\tilde\alpha_j\}$ which generates it, in the sense
that the complete ideal of $\{\tilde\alpha_j\}$ is the same as that of $\{\tilde\beta_i\}$. Exercise 4.29
constructs such a set of generators.


Exercise 4.29 

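Let $\{\bar e_1, \dots, \bar e_m\}$ be a basis for $X_P$ and augment it with any other set
of vectors $\{\bar e_{m+1}, \dots, \bar e_n\}$ to form a basis for $T_P$. Show that the dual
basis one-forms $\{\tilde\omega^{m+1}, \dots, \tilde\omega^{n}\}$ generate the complete ideal. Show
that any form in this ideal can be written as $\sum_{i=m+1}^{n} \tilde\gamma^i \wedge \tilde\omega^i$ for some
$\{\tilde\gamma^i\}$.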

Exercise 4.30 

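Let $\{\tilde\alpha_j, j = 1, \dots, m\}$ be a set of linearly independent one-forms.
Show that any form $\tilde\gamma$ is in their complete ideal if and only if

$$ \tilde\gamma \wedge \tilde\alpha_1 \wedge \tilde\alpha_2 \wedge \dots \wedge \tilde\alpha_m = 0. \tag{4.90} $$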


The above algebra extends naturally to fields of forms. The complete ideal of
a set of fields $\{\tilde\beta_i\}$ is the set of fields which are annulled by the annihilator $X_P$
of $\{\tilde\beta_i\}$ at every point $P$. An ideal is said to be a differential ideal if, for every
$\tilde\gamma$ in the ideal, $\tilde{d}\tilde\gamma$ is also in it. A set of one-forms $\{\tilde\alpha_j\}$ is said to be closed if each
form $\tilde{d}\tilde\alpha_j$ is in the complete ideal generated by the $\tilde\alpha_j$.


Exercise 4.31 
(a) Show that a closed set of one-forms generates a differential ideal. 
(b) On an n-dimensional manifold, show that any linearly independent 
set of n or n — 1 one-forms is closed. 


The Frobenius theorem can now be stated: suppose $\{\tilde\alpha_i, i = 1, \dots, m\}$ are a
linearly independent set of one-form fields in an open region $U$ of an
$n$-dimensional manifold $M$. If and only if they are closed, there exist functions
$\{P_{ij}, Q_j; i, j = 1, \dots, m\}$ such that

$$ \tilde\alpha_i = \sum_{j=1}^{m} P_{ij}\,\tilde{d}Q_j. \tag{4.91} $$


Before proving this in the next section, let us see what it means. We are
looking for solutions of the differential equations $\{\tilde\alpha_i = 0\}$, which are shown by
(4.91) to be equivalent to $\{\tilde{d}Q_j = 0\}$. But this latter set is easy to solve:
$\{Q_j = \text{const}\}$. So the functions $\{Q_j\}$ are the solutions to the equations $\{\tilde\alpha_i = 0\}$. Each
set of values $\{Q_j\}$ defines an $(n - m)$-dimensional submanifold of $M$, since it imposes
$m$ independent conditions on the $n$ coordinates. Its tangent
vectors annul $\{\tilde{d}Q_j\}$ by definition and therefore annul $\{\tilde\alpha_i\}$. This is the link with
our previous version of Frobenius’ theorem. The requirement that the set of
one-forms be closed is the dual of the requirement that the set of vector fields
annulling them be a Lie algebra, as discussed in more detail below.
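
A small worked example may help fix ideas. It is an illustration chosen here, not one taken from the text: on the plane with coordinates $(x, y)$, away from $x = 0$, consider the single one-form $x\,\tilde{d}y - y\,\tilde{d}x$. It is not exact, but a single one-form in two dimensions is automatically closed in the sense above (exercise 4.31(b)), so a representation of the form (4.91) with $m = 1$ should exist:

```latex
\tilde{d}\,(x\,\tilde{d}y - y\,\tilde{d}x) = 2\,\tilde{d}x \wedge \tilde{d}y \neq 0,
\qquad
x\,\tilde{d}y - y\,\tilde{d}x = x^2\,\tilde{d}\!\left(\frac{y}{x}\right),
\qquad
P = x^2, \quad Q = \frac{y}{x}.
```

The solution submanifolds $Q = \text{const}$ are the rays $y = cx$. Their tangent vectors have components $(1, c)$, on which the form takes the value $xc - (cx) = 0$, as expected.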

Forms $\{\tilde\alpha_i\}$ satisfying (4.91) are said to be surface-forming. We can now
establish the sufficiency of the integrability conditions discussed in §4.17. In
that case the manifold had dimension two and the solution submanifold dimension
one. The equation

$$ \tilde\alpha = \tilde{d}f $$

is of the form (4.91), so a function $f$ exists if and only if $\tilde{d}\tilde\alpha = 0$. A more
complicated example follows.


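Exercise 4.32
Consider the set of coupled linear inhomogeneous differential equations
for the functions $f$ and $g$ in the independent variables $x$ and $y$

$$ \begin{aligned}
\frac{\partial f}{\partial x} + A_1 f + B_1 g &= C_1, \\
\frac{\partial f}{\partial y} + A_2 f + B_2 g &= C_2, \\
\frac{\partial g}{\partial x} + D_1 g + E_1 f &= F_1, \\
\frac{\partial g}{\partial y} + D_2 g + E_2 f &= F_2,
\end{aligned} \tag{4.92} $$

where $A_i, B_i, C_i, D_i, E_i, F_i$ $(i = 1, 2)$ are functions of $x$ and $y$. We wish
to establish the integrability conditions for these equations.
(a) In the four-dimensional manifold $M$ whose coordinates are $(x, y, f, g)$
we define two one-forms

$$ \begin{aligned}
\tilde\alpha &= \tilde{d}f + f\tilde{A} + g\tilde{B} - \tilde{C}, \\
\tilde\beta &= \tilde{d}g + g\tilde{D} + f\tilde{E} - \tilde{F},
\end{aligned} \tag{4.93} $$

with the one-form $\tilde{A}$ being defined as

$$ \tilde{A} = A_1\,\tilde{d}x + A_2\,\tilde{d}y, \tag{4.94} $$

and similarly for $\tilde{B}, \tilde{C}, \dots$. Show that finding a two-dimensional
submanifold $H$ in $M$ on which $\tilde\alpha|_H = \tilde\beta|_H = 0$ is equivalent to solving
(4.92).
(b) By Frobenius’ theorem, if $(\tilde\alpha, \tilde\beta)$ are closed, then there exist functions
$U, V, W, X, Y, Z$ of the four variables $(x, y, f, g)$ such that

$$ \tilde\alpha = W\,\tilde{d}U + X\,\tilde{d}V, \qquad \tilde\beta = Y\,\tilde{d}U + Z\,\tilde{d}V. $$

Show that

$$ U(x, y, f, g) = \text{const}, \qquad V(x, y, f, g) = \text{const}, $$

defines a solution to (4.92).
(c) By (b), a necessary and sufficient condition for a solution to exist is
that the two-forms $\tilde{d}\tilde\alpha$ and $\tilde{d}\tilde\beta$ be in the ideal of $(\tilde\alpha, \tilde\beta)$. Show that this
is true if and only if

$$ \begin{aligned}
\tilde{d}\tilde{A} + \tilde{B} \wedge \tilde{E} &= \tilde{d}\tilde{D} + \tilde{E} \wedge \tilde{B} \\
= \tilde{d}\tilde{B} + \tilde{B} \wedge \tilde{D} + \tilde{A} \wedge \tilde{B} &= \tilde{d}\tilde{C} + \tilde{B} \wedge \tilde{F} + \tilde{A} \wedge \tilde{C} \\
= \tilde{d}\tilde{E} + \tilde{E} \wedge \tilde{A} + \tilde{D} \wedge \tilde{E} &= \tilde{d}\tilde{F} + \tilde{E} \wedge \tilde{C} + \tilde{D} \wedge \tilde{F} = 0.
\end{aligned} $$

(Hint: the realization that by (4.94) $\tilde{d}\tilde{A}$ is proportional to $\tilde{d}x \wedge \tilde{d}y$ helps
simplify the algebra enormously.)
(d) Show that the conditions in (c) lead to the integrability conditions for
(4.92):

$$ \frac{\partial A_1}{\partial y} - \frac{\partial A_2}{\partial x} + B_2 E_1 - B_1 E_2 = 0, $$

$$ \frac{\partial B_1}{\partial y} - \frac{\partial B_2}{\partial x} + B_2 D_1 + A_2 B_1 - B_1 D_2 - A_1 B_2 = 0, $$

and so on.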


What does Frobenius’ theorem have to say about the existence of solutions
to equation (4.89)? The answer is simple: since any two linearly independent
one-forms in a three-dimensional manifold automatically have a closed ideal
(cf. exercise 4.31(b)), there must exist functions $f, g, h, l, m, n$ for which

$$ \tilde\alpha = h\,\tilde{d}f + l\,\tilde{d}g, \qquad \tilde\beta = m\,\tilde{d}f + n\,\tilde{d}g. $$

Then the one-dimensional submanifolds defined by $f$ = const, $g$ = const annul
the forms $\tilde\alpha$ and $\tilde\beta$, and so are the solution submanifolds.

Our version of Frobenius’ theorem does not directly deal with systems of 
differential equations described by sets of forms including two-forms or forms 
of higher degree. This case can be handled by finding a set of one-forms which 
generate the same complete ideal, as in exercise 4.29. It will not always be the
case that these one-forms are algebraically equivalent to the original set, i.e. they
might not give differential equations equivalent to the original ones. If they do, 
Frobenius’ theorem applies directly. If not, then a more subtle approach is 
needed. See Choquet-Bruhat et al. (1977) for a discussion. 


4.27 Proof of the equivalence of the two versions of Frobenius’ theorem 
Let us recall the geometrically more transparent version given in
chapter 3: a given set of $q$ vector fields $\{V_{(i)}, i = 1, \dots, q\}$, which at every
point form a $p$-dimensional vector space, will mesh to form a $p$-dimensional
hypersurface if and only if all the Lie brackets $[V_{(i)}, V_{(j)}]$ $(i, j = 1, \dots, q)$ are
linear combinations of the $q$ vector fields. The version given in this chapter
involves forms and the closure of their exterior derivatives; this is a picture
‘dual’, or complementary, to one with vectors and the closure of their Lie
brackets. The key element in the correspondence between the two pictures is
that if the vector fields define an $r$-dimensional subspace of $T_P$ at a point $P$ of
an $n$-dimensional manifold, then they define in a natural way an $(n - r)$-dimensional
subspace of $T^*_P$, the space of one-forms at $P$, by the requirement
that the forms be annulled by the vectors. Conversely, the same requirement
allows a set of $q$ one-forms to define an $(n - q)$-dimensional subspace of $T_P$.
What we have, in effect, is that a submanifold can be described either by giving
at every point the $r$-dimensional subspace of $T_P$ which contains the vectors
tangent to it, or by giving the $(n - r)$-dimensional subspace of one-forms
annulled by those vectors. The proof of the equivalence between the two
versions of Frobenius’ theorem has two steps.

(1) Consider a submanifold of dimension $p$ in a manifold of dimension $n$:
there are $n - p$ different functions $Q_{(k)}$ which (locally) define the hypersurface
by the $n - p$ equations $Q_{(k)}$ = const. The forms $\tilde{d}Q_{(k)}$ are, by hypothesis, all
linearly independent, and they are all annulled by any vector $V$ tangent to the
submanifold: $\langle \tilde{d}Q_{(k)}, V \rangle = 0$. On the other hand, the tangent space to the
submanifold is a $p$-dimensional vector space, which therefore defines an $(n - p)$-dimensional
subspace of one-forms, such that any one-form $\tilde\beta$ in this subspace
is annulled by all the $V_{(j)}$: $\langle \tilde\beta, V_{(j)} \rangle = 0$. Let $\{\tilde\alpha_{(k)}, k = 1, \dots, n - p\}$ be any
basis for the subspace. It is clear that the forms $\tilde{d}Q_{(k)}$ are also a basis, so that
any $\tilde\alpha_{(k)}$ can be written as a linear combination of all the $\tilde{d}Q_{(k)}$s, as in equation
(4.40). So the equivalence proof must now show that the condition on the
vector fields (closure of their Lie brackets) is equivalent to the closure
condition on the forms $\{\tilde\alpha_{(k)}\}$.

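(2) This is done by beginning with the equation

$$ \langle \tilde\alpha_{(i)}, V_{(j)} \rangle = 0 \quad (i = 1, \dots, n - p;\ j = 1, \dots, p), $$

and taking its Lie derivative with respect to any $V_{(k)}$:

$$ 0 = £_{V_{(k)}} \langle \tilde\alpha_{(i)}, V_{(j)} \rangle = \langle £_{V_{(k)}} \tilde\alpha_{(i)}, V_{(j)} \rangle + \langle \tilde\alpha_{(i)}, £_{V_{(k)}} V_{(j)} \rangle. $$

By the rules for the Lie derivatives of forms we have

$$ \langle £_{V_{(k)}} \tilde\alpha_{(i)}, V_{(j)} \rangle = \langle \tilde{d}[\tilde\alpha_{(i)}(V_{(k)})], V_{(j)} \rangle + \langle \tilde{d}\tilde\alpha_{(i)}(V_{(k)}), V_{(j)} \rangle. $$

The first term vanishes because $\tilde\alpha_{(i)}$ is by definition annulled by $V_{(k)}$, while
the second one is just $\tilde{d}\tilde\alpha_{(i)}(V_{(k)}, V_{(j)})$, the value of $\tilde{d}\tilde\alpha_{(i)}$ on two vectors in
the original set. Now, if $£_{V_{(k)}} V_{(j)}$ is a linear combination of some $V_{(l)}$, then it
annuls $\tilde\alpha_{(i)}$ and we have that $\tilde{d}\tilde\alpha_{(i)}$ is annulled by the $\{V_{(j)}\}$ as well. Therefore,
$\tilde{d}\tilde\alpha_{(i)}$ is in the ideal, and closure of the Lie brackets implies closure of the forms.
Conversely it is easy to see that closure of the forms (which implies
$\tilde{d}\tilde\alpha_{(i)}(V_{(k)}, V_{(j)}) = 0$) implies closure of the Lie brackets.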


4.28 Conservation laws 

A particularly nice approach to conservation laws for differential 
equations is afforded by forms. Suppose solving a system of equations is equivalent
to finding surfaces that annul a certain set of forms $\{\tilde\alpha_i\}$. Suppose further
that there exists a form $\tilde\gamma$, a linear combination of $\{\tilde\alpha_i\}$,

$$ \tilde\gamma = \lambda_1 \tilde\alpha_1 + \dots, $$

such that

$$ \tilde{d}\tilde\gamma = 0. $$

Then there exists another form $\tilde\phi$ such that

$$ \tilde\gamma = \tilde{d}\tilde\phi $$

in a suitable region $U$ of a solution surface $H$, and

$$ \tilde\gamma|_H = \tilde{d}\tilde\phi|_H = 0. \tag{4.95} $$

Applying Stokes’ theorem to the integral of $\tilde{d}\tilde\phi$ on the region $U$ of $H$ gives

$$ \int_U \tilde{d}\tilde\phi = \oint_{\partial U} \tilde\phi. $$

But by (4.95) this vanishes:

$$ \oint_{\partial U} \tilde\phi = 0 $$

on the boundary of the region of a solution surface. This is a kind of integral
conservation law, as we now illustrate for the harmonic oscillator.
The solution surfaces of (4.89) are one-dimensional curves, so the form
$\tilde{d}\tilde\phi$ must be a one-form, and $\tilde\phi$ is in fact a zero-form (a function). Since $\tilde{d}\tilde\phi$ is
the same as $\tilde\gamma$, consider the form (notation same as before (§4.25))


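$$ \tilde\gamma = \omega^2 x\,\tilde\alpha + y\,\tilde\beta. $$

It is easy to verify that

$$ \tilde{d}\tilde\gamma = 0, \tag{4.96} $$

and in fact that

$$ \tilde\gamma = \tilde{d}\left(\tfrac12 y^2 + \tfrac12 \omega^2 x^2\right). \tag{4.97} $$

Then on a solution curve, for which $\tilde\alpha = \tilde\beta = 0$ and hence $\tilde\gamma = 0$, we have that

$$ \tilde{d}\left(\tfrac12 y^2 + \tfrac12 \omega^2 x^2\right) = 0, $$

$$ 0 = \int \tilde{d}\left(\tfrac12 y^2 + \tfrac12 \omega^2 x^2\right) = \left.\left(\tfrac12 y^2 + \tfrac12 \omega^2 x^2\right)\right|_{p_1}^{p_2}, $$

where $p_1$ and $p_2$ are the endpoints of the region of the curve we integrated
over. This just expresses the constancy of the energy, $\tfrac12 y^2 + \tfrac12 \omega^2 x^2$, along a
solution curve.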
For an application of this point of view to equations having soliton
solutions, the interested reader is referred to Estabrook & Wahlquist (1975) (see
bibliography).
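
The integral conservation law for the oscillator can also be checked numerically. The following is only a sketch under stated assumptions: the solution curve is generated with a classical Runge-Kutta integrator (a choice made here, not a method from the text), the oscillator is written as dx/dt = y, dy/dt = -omega^2 x, and the step size, frequency and initial data are arbitrary sample values.

```python
def rhs(x, y, omega):
    """First-order form of the oscillator: dx/dt = y, dy/dt = -omega^2 x."""
    return y, -omega**2 * x

def rk4_step(x, y, dt, omega):
    """One classical fourth-order Runge-Kutta step along a solution curve."""
    k1x, k1y = rhs(x, y, omega)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, omega)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, omega)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, omega)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x, y

def energy(x, y, omega):
    """The conserved function y^2/2 + omega^2 x^2/2."""
    return 0.5 * y**2 + 0.5 * omega**2 * x**2

def energy_drift(x0=1.0, y0=0.0, omega=2.0, dt=1e-3, steps=5000):
    """Difference of the energy between the two endpoints of the curve."""
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, omega)
    return energy(x, y, omega) - energy(x0, y0, omega)

print(energy_drift())
```

The printed difference is zero to within the accuracy of the integrator, which is the numerical counterpart of the vanishing boundary term.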


Exercise 4.33 
Verify equations (4.96) and (4.97). 
4.29 Vector spherical harmonics

We resume here our discussion of spherical harmonics in §3.18. In that
section we noted that a finite-dimensional representation of SO(3) in the space
of functions on $S^2$, $L^2(S^2)$, had the basis $\{Y_{lm}, m = -l, \dots, l\}$. How do we
create a related basis for vector fields on $S^2$? The space of all vector fields can be
given a natural norm in terms of the metric $\mathbf{g}$ of $S^2$, whose components are
$\{g_{\theta\theta} = 1, g_{\phi\phi} = \sin^2\theta, g_{\theta\phi} = 0\}$ in the usual spherical coordinates. If we let $\tilde\omega$
be the metric-induced volume form on $S^2$ (§4.13) then the space of square-integrable
vector fields on $S^2$ is the vector space of all vector fields $V$ on $S^2$ whose norm

$$ \|V\|^2 = \int_{S^2} \mathbf{g}(V, V)\,\tilde\omega \tag{4.98} $$

is finite. What we want are vector fields in this space which are eigenfunctions of
$\bar l_z$ and $L^2$.

We use two facts: first, $\mathbf{g}$ and hence $\tilde\omega$ are invariant under $\bar l_z$ and $L^2$; and
second, exterior differentiation and Lie differentiation commute. From the
function $Y_{lm}$ we construct the one-form $\tilde{d}Y_{lm}$, and from it the vector $\nabla Y_{lm}$,
with components (indices A, B run over 1 and 2)

$$ (\nabla Y_{lm})^A = g^{AB}\,(Y_{lm})_{,B}. \tag{4.99} $$

Evidently this is also an eigenfunction of $\bar l_z$ and $L^2$:

$$ £_{\bar l_z}(\nabla Y_{lm}) = i m\,\nabla Y_{lm}, \tag{4.100a} $$

$$ L^2(\nabla Y_{lm}) = -l(l + 1)\,\nabla Y_{lm}. \tag{4.100b} $$

But we cannot stop with one sort of vector harmonic, since we need to span a
two-dimensional vector space. Here we take advantage of the fact that there is
another way (on a two-dimensional manifold) to construct a vector from a
one-form: the dual operation. So we also have ${}^*\tilde{d}Y_{lm}$, which is of course also an
eigenfunction.


Exercise 4.34
Show that $\nabla Y_{lm}$ and ${}^*\tilde{d}Y_{lm}$ are in general linearly independent vectors
at each point.


It follows from the completeness theorem quoted in §3.18 that the two sets of
vector spherical harmonics 


4 Yin ση. (41014) 
+ Yim = *dYim (4.101b) 
form a complete set for representing vectors on the two-sphere. 

It is possible to follow this procedure further and define second-rank tensor
spherical harmonics. This would, however, involve us with the covariant
derivative on the sphere, which we have not yet discussed (see chapter 6). Interested
readers may consult the paper by Regge & Wheeler (1957) listed in the
bibliography.

Note that we have discussed only scalars and vectors on the sphere. Most
applications involve larger manifolds with spherical symmetry, in which the
spheres are submanifolds. As a simple example, consider three-dimensional
Euclidean space $E^3$. A function on $E^3$ can be expanded in a series
$\sum_{l,m} f_{lm}(r)\,Y_{lm}$, where its $r$-dependence is entirely contained in $\{f_{lm}\}$. A vector field $V$ on
$E^3$ can be split into two fields

$$ V = V_\perp + V_\parallel, $$

where $V_\perp$ is perpendicular to the spheres (parallel to $\bar e_r$) and $V_\parallel$ is tangent to
the spheres. If we write $V_\perp$ as $v\,\bar e_r$, where $v$ is a function, then under a rotation
$v$ transforms as a scalar function on the sphere while $V_\parallel$ transforms as a vector
on the sphere. So $V_\parallel$ must be expanded in terms of vector spherical harmonics
while $v$ is expanded in scalar spherical harmonics. (Many authors multiply these
scalars by $\bar e_r$ and call the resulting set a third kind of vector spherical harmonic.)
We shall employ these in our examination of cosmological models in chapter 5
part E.

There are other equivalent formulations of vector spherical harmonics which, 
at first sight, seem to have very little to do with the ones defined here. These are 
defined by the algebraic methods of group theory (cf. Edmonds, 1957). The set
presented here is convenient to use in differential equations, where the
derivatives we have used occur naturally.
5 APPLICATIONS IN PHYSICS


A Thermodynamics 


5.1 Simple systems 
We confine our attention at first to a one-component fluid, for which
the equation of conservation of energy is

$$ \tilde\delta Q = P\,\tilde{d}V + \tilde{d}U, \tag{5.1} $$

where $U$ is the internal energy of the fluid and $\tilde\delta Q$ is the heat absorbed as the
fluid does work $P\,\tilde{d}V$ and changes its energy. We shall interpret this equation as a
relation among various one-forms in the two-dimensional manifold whose
coordinates are $(V, U)$, on which the function $P(V, U)$ is defined (called the
equation of state). Then since $\tilde{d}V$ and $\tilde{d}U$ are one-forms, so is $\tilde\delta Q$. But is $\tilde\delta Q$ an
exact one-form? That is, can one find a function $Q(V, U)$ such that $\tilde\delta Q = \tilde{d}Q$? If
this were true, then one would have $\tilde{d}\tilde{d}Q = 0$, which would mean

$$ 0 = \tilde{d}P \wedge \tilde{d}V = \left[ \left(\frac{\partial P}{\partial V}\right)_U \tilde{d}V + \left(\frac{\partial P}{\partial U}\right)_V \tilde{d}U \right] \wedge \tilde{d}V = \left(\frac{\partial P}{\partial U}\right)_V \tilde{d}U \wedge \tilde{d}V. $$

(Subscripts on derivatives indicate which variable is fixed during differentiation.)
Thus, a function $Q$ can exist only if $(\partial P/\partial U)_V$ vanishes everywhere: this would
be a strange fluid indeed!

Since $\tilde\delta Q$ is a one-form in a two-space, its ideal is automatically closed, so by
Frobenius’ theorem (§4.26) there must exist functions $T(U, V)$ and $S(U, V)$
such that $\tilde\delta Q = T\,\tilde{d}S$. Thus, we define the temperature and entropy functions
for the single-component gas in thermodynamic equilibrium simply as a
representation of the one-form in equation (5.1):

$$ T\,\tilde{d}S = P\,\tilde{d}V + \tilde{d}U. \tag{5.2} $$

It is important to understand that this is a purely mathematical definition of $T$
and $S$, and it has no relation to the second law of thermodynamics, which we
will consider in a moment. No mathematical identity of this sort would hold for
a multi-component fluid. (We shall see that the second law of thermodynamics is
equivalent to requiring $\tilde\delta Q = T\,\tilde{d}S$ for composite systems. Because this is not an
automatic identity, the second law is a physical law: it restricts the possible
mathematical nature of physical systems.)
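
To make the definition concrete, here is a worked illustration that is not in the original text. Assume a monatomic ideal gas, for which $PV = NkT$ and $U = \tfrac32 NkT$, so that the equation of state on the $(V, U)$ manifold is $P = 2U/3V$. The symbols $N$ and $k$ (particle number and Boltzmann's constant) are introduced only for this sketch.

```latex
\begin{aligned}
&\left(\frac{\partial P}{\partial U}\right)_V = \frac{2}{3V} \neq 0
  \quad\Longrightarrow\quad \tilde\delta Q \text{ is not exact}, \\[4pt]
&T = \frac{2U}{3Nk}, \qquad
  S = Nk\left(\ln V + \tfrac32 \ln U\right) + \text{const}, \\[4pt]
&T\,\tilde{d}S
  = \frac{2U}{3Nk}\,Nk\left(\frac{\tilde{d}V}{V} + \frac32\,\frac{\tilde{d}U}{U}\right)
  = \frac{2U}{3V}\,\tilde{d}V + \tilde{d}U
  = P\,\tilde{d}V + \tilde{d}U .
\end{aligned}
```

So for this assumed fluid the heat one-form is not exact, yet it does take the form of a function times an exact one-form, as the argument from Frobenius' theorem requires.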


5.2 Maxwell and other mathematical identities 
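Taking the exterior derivative of (5.2) gives

$$ \tilde{d}T \wedge \tilde{d}S = \tilde{d}P \wedge \tilde{d}V. \tag{5.3} $$

Suppose we write $T = T(S, V)$, $P = P(S, V)$. Then (5.3) gives (since $\tilde{d}S \wedge \tilde{d}S = 0$,
$\tilde{d}V \wedge \tilde{d}V = 0$):

$$ \left(\frac{\partial T}{\partial V}\right)_S \tilde{d}V \wedge \tilde{d}S = \left(\frac{\partial P}{\partial S}\right)_V \tilde{d}S \wedge \tilde{d}V = -\left(\frac{\partial P}{\partial S}\right)_V \tilde{d}V \wedge \tilde{d}S. $$

From this we conclude

$$ \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V, \tag{5.4} $$

which is known as one of the Maxwell identities. Similarly, by writing $S = S(T, V)$,
$P = P(T, V)$, we can deduce

$$ \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V, \tag{5.5} $$

another Maxwell identity. By dividing (5.2) by $T$ and then taking the exterior
derivative we get

$$ \frac{1}{T}\,\tilde{d}P \wedge \tilde{d}V - \frac{P}{T^2}\,\tilde{d}T \wedge \tilde{d}V - \frac{1}{T^2}\,\tilde{d}T \wedge \tilde{d}U = 0. $$

By writing $U = U(T, V)$, $P = P(T, V)$, we get

$$ \frac{1}{T}\left(\frac{\partial P}{\partial T}\right)_V \tilde{d}T \wedge \tilde{d}V - \frac{P}{T^2}\,\tilde{d}T \wedge \tilde{d}V - \frac{1}{T^2}\left(\frac{\partial U}{\partial V}\right)_T \tilde{d}T \wedge \tilde{d}V = 0, $$

or

$$ T\left(\frac{\partial P}{\partial T}\right)_V - P = \left(\frac{\partial U}{\partial V}\right)_T. \tag{5.6} $$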

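Exercise 5.1
Derive the identity

$$ T\left(\frac{\partial P}{\partial T}\right)_S - P = \left(\frac{\partial P}{\partial T}\right)_S \left(\frac{\partial U}{\partial S}\right)_T - \left(\frac{\partial P}{\partial S}\right)_T \left(\frac{\partial U}{\partial T}\right)_S \tag{5.7} $$

by multiplying (5.2) by 1/P and differentiating.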
Another important relation which follows easily from the use of forms is

$$ \left(\frac{\partial T}{\partial P}\right)_S \left(\frac{\partial S}{\partial T}\right)_P \left(\frac{\partial P}{\partial S}\right)_T = -1, \tag{5.8} $$

which is equally true of any set of three of $(P, V, U, T, S)$. We prove this by
writing

$$ T = T(P, S), \quad S = S(T, P), \quad P = P(T, S), \tag{5.9} $$

which is possible since the manifold is two-dimensional. Then we have the
successive identities:

$$ \begin{aligned}
\tilde{d}T \wedge \tilde{d}S &= \left(\frac{\partial T}{\partial P}\right)_S \tilde{d}P \wedge \tilde{d}S \\
&= \left(\frac{\partial T}{\partial P}\right)_S \left(\frac{\partial S}{\partial T}\right)_P \tilde{d}P \wedge \tilde{d}T \\
&= \left(\frac{\partial T}{\partial P}\right)_S \left(\frac{\partial S}{\partial T}\right)_P \left(\frac{\partial P}{\partial S}\right)_T \tilde{d}S \wedge \tilde{d}T,
\end{aligned} $$

from which follows (5.8). Notice that the derivation here relies only on the
ability to write (5.9), so that it is really an identity among any three functions
on a two-dimensional manifold.

The ease with which the Maxwell identities and (5.8) can be derived using
forms is an illustration of the natural way in which they fit into thermodynamics:
the one-forms $\tilde{d}P$, $\tilde{d}S$, etc. are the mathematically precise substitutes for the
physicists’ rather fuzzier concept of the infinitesimals $dP$, $dS$, etc.

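
The remark that the triple-product relation holds for any three functions on a two-dimensional manifold can be tried out numerically. The sketch below is an illustration under assumptions: the three functions of the coordinates (u, v) are arbitrary stand-ins, not thermodynamic data, and a partial derivative at fixed third variable is computed as a ratio of two wedge-product coefficients (Jacobians), estimated by central differences.

```python
import math

def jacobian(f, g, u, v, h=1e-5):
    """Central-difference estimate of d(f, g)/d(u, v), i.e. the coefficient of
    du ^ dv in the two-form df ^ dg."""
    f_u = (f(u + h, v) - f(u - h, v)) / (2 * h)
    f_v = (f(u, v + h) - f(u, v - h)) / (2 * h)
    g_u = (g(u + h, v) - g(u - h, v)) / (2 * h)
    g_v = (g(u, v + h) - g(u, v - h)) / (2 * h)
    return f_u * g_v - f_v * g_u

def partial(f, g, k, u, v):
    """(df/dg) at constant k, computed as (df ^ dk) / (dg ^ dk)."""
    return jacobian(f, k, u, v) / jacobian(g, k, u, v)

# Three arbitrary (hypothetical) functions standing in for T, S and P.
T = lambda u, v: u + 2.0 * v + u * v
S = lambda u, v: u * u - v + 1.0
P = lambda u, v: math.sin(u) + v ** 3

def triple_product(u, v):
    """(dT/dP)_S * (dS/dT)_P * (dP/dS)_T at the point (u, v)."""
    return (partial(T, P, S, u, v)
            * partial(S, T, P, u, v)
            * partial(P, S, T, u, v))

print(triple_product(1.0, 2.0))
```

Writing each partial derivative as a ratio of two-forms makes the result transparent: the three ratios cancel in pairs apart from the sign picked up by reordering the wedge products.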


5.3 Composite thermodynamic systems: Caratheodory’s theorem 
We now consider composite thermodynamic systems, the parts of
which may exchange energy with each other and with the outside world. In this
case the law of conservation of energy is (for a system with $N$ parts)

$$ \tilde\delta Q = P_1\,\tilde{d}V_1 + \tilde{d}U_1 + P_2\,\tilde{d}V_2 + \tilde{d}U_2 + \dots = \sum_{i=1}^{N} (P_i\,\tilde{d}V_i + \tilde{d}U_i). \tag{5.10} $$

We regard this as a relation among one-forms on a $2N$-dimensional manifold
whose coordinates are $(V_i, U_i; i = 1, \dots, N)$, and we assume that each $P_i$ can be
expressed as a function of these coordinates. The question arises of whether one
can define an entropy and temperature for the system as a whole, i.e. whether $T$
and $S$ exist such that

$$ \tilde\delta Q = T\,\tilde{d}S. \tag{5.11} $$

This equation is just the statement that $\tilde\delta Q$ is integrable (in the sense of the
Frobenius theorem). Now the Frobenius theorem tells us that the necessary and
sufficient condition for this to be true is $\tilde{d}\tilde\delta Q \wedge \tilde\delta Q = 0$. It is easy to see from
(5.6) that this will not generally be true, so we can conclude that for a general
interacting system there is no global temperature or entropy function. But the
situation can be different for an equilibrium system, because the conditions for
mechanical and thermodynamic equilibrium among the constituent parts restrict
the problem (we assume) to a submanifold of the $2N$-dimensional one. We shall
from now on let the word ‘manifold’ refer to this equilibrium submanifold, and
examine the possibility that $\tilde\delta Q$ is integrable in it from the point of view of
Caratheodory.

If $\tilde\delta Q$ is integrable, then every point of the manifold is on one and only one
integral submanifold; these submanifolds are defined by $S$ = const. None of these
surfaces intersect. Therefore, starting at one point, it is not possible to reach an
arbitrary point of the manifold along a curve on which $\tilde\delta Q$ is everywhere zero. In
other words, if an entropy function exists it is not possible to reach every
equilibrium state of the system along an adiabatic path of equilibria. The physically
interesting question is whether the converse is true: if we know that not every
state is reachable along a path for which $\tilde\delta Q = 0$, can we say that $\tilde\delta Q$ is
integrable? This is interesting because one version of the second law of
thermodynamics asserts that it is impossible in a closed system to transfer heat from a
colder to a hotter body without making other changes as well. By a closed
system we mean one for which $\tilde\delta Q = 0$, so that the second law tells us that not
every state can be achieved with $\tilde\delta Q = 0$. So does the second law imply the
existence of an entropy function? Caratheodory’s theorem says it does.

What we shall prove is that if $\tilde\delta Q$ is not integrable then all points in the
neighborhood of some initial point $P$ are reachable from $P$ on a curve which annuls
$\tilde\delta Q$. Since $\tilde\delta Q$ is not integrable, the version of Frobenius’ theorem given in §4.26
shows us that there are at least two vector fields $V$ and $W$ for which
$\tilde\delta Q(V) = \tilde\delta Q(W) = 0$ in a neighborhood of any point $P$, but $\tilde\delta Q([V, W]) \neq 0$ at $P$. That
is, the one-form $\tilde\delta Q$ defines at each point $P$ a subspace $K_P$ of $T_P$, the vectors of
which annul $\tilde\delta Q$; the nonintegrability of $\tilde\delta Q$ means that vector fields everywhere
in $K_P$ do not form a hypersurface: at least one of their Lie brackets does not lie
in $K_P$ (see figure 5.1). Because annulling $\tilde\delta Q$ is only one equation, $K_P$ has
dimension $n - 1$, where $n$ is the dimension of the equilibrium manifold.

Fig. 5.1. The tangent hyperplane $K_P$ contains the vectors annulling $\tilde\delta Q$
but not all of their Lie brackets at $P$.

Now, recall the exponentiation notation for the Taylor series introduced in §2.13. If
we take any vector field $U$ which is in $K_P$ at all points $P$, and we move along it a
parameter distance $\epsilon$ from $P$, we reach the point whose coordinates are
$x^i = \exp(\epsilon U)\,x^i|_P$, where we use $U$ as a derivative operator on the function $x^i$
along the curve. The set of all points in a small neighborhood of $P$ reachable in
this way may be called $\exp(\epsilon K_P)$: it is the representation in the manifold of the
vector space $K_P$. This set of points is locally like a piece of an $(n - 1)$-dimensional
hypersurface. We shall show that, by following the curves of $V$ and $W$ defined
above, we can reach points ‘above’ or ‘below’ this ‘hypersurface’, i.e. that we
can reach all points near $P$. The trip we make is the following: we move first a
distance $\epsilon$ along $V$, then $\epsilon$ along $W$, then $-\epsilon$ along $V$, and finally $-\epsilon$ along $W$.
This takes us to (cf. equation (2.6))

$$ x^i = e^{-\epsilon W} e^{-\epsilon V} e^{\epsilon W} e^{\epsilon V}\,x^i|_P = (1 + \epsilon^2 [W, V] + O(\epsilon^3))\,x^i|_P. \tag{5.12} $$

This means that we wind up almost back at $P$, but a parameter distance $\epsilon^2$ away
from it along $[V, W]$. This point is not in $\exp(\epsilon K_P)$, since $[V, W]$ is not in $K_P$.
It is on one side of $\exp(\epsilon K_P)$; to finish on the other side we would have
travelled first on $W$, then on $V$. Now, our path was along $V$ or $W$ everywhere, so it
was adiabatic: $\tilde\delta Q = 0$ everywhere. It is clear, therefore, that if $\tilde\delta Q$ is not
integrable, all states of the system will be reachable along adiabatic paths. This proves
that the second law requires integrability of $\tilde\delta Q$ in the equilibrium manifold and
the existence of an entropy function for composite systems in equilibrium.
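
The round trip can be tried out on a simple stand-in example, which is not taken from the text. On three-dimensional space with coordinates (x, y, z), the one-form dz - y dx plays the role of the heat one-form. It is annulled by V = d/dx + y d/dz and W = d/dy, whose Lie bracket [V, W] = -d/dz is not annulled, so the form is not integrable. The flows of V and W are known in closed form, so the sketch below (all names hypothetical) follows them exactly.

```python
def flow_V(point, s):
    """Move a parameter distance s along V = d/dx + y d/dz (y stays fixed)."""
    x, y, z = point
    return (x + s, y, z + s * y)

def flow_W(point, s):
    """Move a parameter distance s along W = d/dy."""
    x, y, z = point
    return (x, y + s, z)

def round_trip(point, eps):
    """eps along V, eps along W, -eps along V, and finally -eps along W."""
    point = flow_V(point, eps)
    point = flow_W(point, eps)
    point = flow_V(point, -eps)
    point = flow_W(point, -eps)
    return point

start = (0.0, 0.0, 0.0)
eps = 0.1
end = round_trip(start, eps)
print(end)  # x and y return to their starting values; z is shifted by -eps**2
```

Every leg of the path annuls the one-form, yet the end point is displaced from the start by eps squared along the bracket direction, off the plane spanned by V and W at the starting point.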


B Hamiltonian mechanics 


5.4 Hamiltonian vector fields 

The Hamiltonian version of a dynamical system of equations begins
with the Lagrangian $\mathcal{L}(q, q_{,t})$ for some dynamical variable $q(t)$. The momentum
$p$ is defined as

$$ p = \partial\mathcal{L}/\partial(q_{,t}), \tag{5.13} $$

and the Hamiltonian $H$ as

$$ H = p\,q_{,t} - \mathcal{L} = H(p, q). \tag{5.14} $$

The dynamical equation

$$ \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial q_{,t}} - \frac{\partial\mathcal{L}}{\partial q} = 0, \tag{5.15} $$

and the definition of $p$ can be written, respectively, as

$$ \frac{\partial H}{\partial q} = -\frac{dp}{dt}, \qquad \frac{\partial H}{\partial p} = \frac{dq}{dt}. \tag{5.16} $$

We now make a geometric picture of Hamiltonian dynamics by defining a manifold
$M$ called ‘phase space’, whose coordinates are $p$ and $q$. On $M$ we define the
two-form

$$ \tilde\omega = \tilde{d}q \wedge \tilde{d}p. \tag{5.17} $$

Consider a curve $\{q = f(t), p = g(t)\}$ on $M$ which is a solution of (5.16). Its tangent
vector, $U = d/dt = f_{,t}\,\partial/\partial q + g_{,t}\,\partial/\partial p$, has the property

$$ £_U \tilde\omega = 0, \tag{5.18} $$

as we shall now prove. Since $\tilde{d}\tilde\omega = 0$, we have from (4.67)

$$ £_U \tilde\omega = \tilde{d}[\tilde\omega(U)]. \tag{5.19} $$

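But since $\tilde\omega = \tilde{d}q \otimes \tilde{d}p - \tilde{d}p \otimes \tilde{d}q$, we have

$$ \tilde\omega(U) = \langle \tilde{d}q, U \rangle\,\tilde{d}p - \langle \tilde{d}p, U \rangle\,\tilde{d}q = \frac{df}{dt}\,\tilde{d}p - \frac{dg}{dt}\,\tilde{d}q. \tag{5.20} $$

On the other hand, since $f$ and $g$ satisfy (5.16), we have

$$ \tilde\omega(U) = \frac{\partial H}{\partial p}\,\tilde{d}p + \frac{\partial H}{\partial q}\,\tilde{d}q = \tilde{d}H. \tag{5.21} $$

Therefore $\tilde{d}[\tilde\omega(U)]$ vanishes, establishing (5.18). A vector field $U$ that satisfies
(5.18) is called a Hamiltonian vector field.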


Exercise 5.2 
(a) Prove that if $U$ is a Hamiltonian vector field, there exists some $H(p, q)$
such that equations (5.16) are satisfied along the integral curves of $U$.
(b) Prove that Hamiltonian vector fields form a Lie algebra.


By exercise 5.2(a), we interpret $U$ as a tangent to the solution curves in phase
space if $U$ is Hamiltonian. Notice that the system is conservative, since (5.16)
implies

$$ £_U H = \frac{dH}{dt} = 0. \tag{5.22} $$

5.5 Canonical transformation 


Now the coordinates $p$ and $q$ are not unique. We define a canonical
transformation as one which leaves $\tilde\omega$ in the same form. That is, new coordinates
$P = P(q, p)$ and $Q = Q(q, p)$ are called canonical if

$$ \tilde{d}q \wedge \tilde{d}p = \tilde{d}Q \wedge \tilde{d}P. \tag{5.23} $$

The necessary and sufficient condition for this is

$$ \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial Q}{\partial p}\frac{\partial P}{\partial q} = 1. \tag{5.24} $$

One such transformation is $Q = p$, $P = -q$. A less trivial one is found if we
follow a procedure similar to the one we used to deduce the Maxwell identities
in thermodynamics: we write $p = p(q, Q)$, $P = P(q, Q)$ and find from (5.23) that

$$ \partial p/\partial Q = -\,\partial P/\partial q. \tag{5.25} $$

So if we take an arbitrary function $F(q, Q)$ and define

$$ p = \partial F/\partial q, \qquad P = -\,\partial F/\partial Q, $$

then (5.25) is satisfied identically. Thus, $F(q, Q)$ is said to generate a canonical
transformation. Since we could have chosen, instead of $(q, Q)$, the pairs $(q, P)$,
$(p, Q)$, or $(p, P)$ to be independent in (5.23), there are clearly four types of such
generating functions for canonical transformations. They are explored more
fully in Goldstein (1950) (see bibliography).
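
As a hedged illustration of a generating function at work, the sketch below takes the sample choice F(q, Q) = (q^2/2) cot Q, a standard textbook example that is an assumption here rather than something given in the text. It yields p = q cot Q and P = q^2 / (2 sin^2 Q), that is Q = arctan(q/p) and P = (q^2 + p^2)/2. The code checks the determinant condition, dQ/dq dP/dp - dQ/dp dP/dq = 1, by central differences at a sample point.

```python
import math

def new_coords(q, p):
    """Hypothetical canonical map generated by F(q, Q) = (q**2 / 2) * cot(Q):
    p = dF/dq = q*cot(Q) and P = -dF/dQ = q**2 / (2*sin(Q)**2)."""
    Q = math.atan2(q, p)          # solves cot(Q) = p / q
    P = 0.5 * (q * q + p * p)     # q**2 / (2 sin^2 Q) rewritten with cot(Q) = p/q
    return Q, P

def canonical_determinant(q, p, h=1e-6):
    """Central-difference estimate of dQ/dq * dP/dp - dQ/dp * dP/dq."""
    Qq = (new_coords(q + h, p)[0] - new_coords(q - h, p)[0]) / (2 * h)
    Qp = (new_coords(q, p + h)[0] - new_coords(q, p - h)[0]) / (2 * h)
    Pq = (new_coords(q + h, p)[1] - new_coords(q - h, p)[1]) / (2 * h)
    Pp = (new_coords(q, p + h)[1] - new_coords(q, p - h)[1]) / (2 * h)
    return Qq * Pp - Qp * Pq

print(canonical_determinant(0.8, 1.3))
```

The value is 1 up to discretization error, so the two-form keeps its form in the new coordinates for this choice of F.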


5.6 Map between vectors and one-forms provided by $\tilde\omega$

One of the most important features of this geometrical point of view on
Hamiltonian dynamics is that $\tilde\omega$ can be cast in a role similar to that which a
metric plays on Riemannian manifolds: it provides an invertible 1-1 mapping
between vectors and one-forms. If $V$ is a vector field on $M$, we define a one-form
field

$$ \tilde V = \tilde\omega(V), \tag{5.26} $$

with components

$$ (\tilde V)_j = \omega_{ij} V^i. \tag{5.27} $$

Similarly, given a one-form field $\tilde\alpha$ we define a vector field $\bar\alpha$ as the (unique)
vector such that

$$ \tilde\alpha = \tilde\omega(\bar\alpha). \tag{5.28} $$


Exercise 5.3
Prove that $\langle \tilde V, V \rangle = 0$, so that $\tilde\omega$ is not suitable as a metric.


Exercise 5.4
Prove that if $\tilde\alpha = f\,\tilde{d}q + g\,\tilde{d}p$, then

$$ \bar\alpha = g\,\frac{\partial}{\partial q} - f\,\frac{\partial}{\partial p}. \tag{5.29} $$

Exercise 5.5

Prove that $\bar X$ is a Hamiltonian vector field on $M$ if and only if $\tilde X$ is an
exact one-form, i.e. if and only if there exists some function $H$ such
that $\tilde X = \tilde{d}H$, or $\bar X = \overline{\tilde{d}H}$.


5.7 Poisson bracket 
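Suppose there are two functions $f$ and $g$ on the manifold, and we define
the vector fields $X_f = \overline{\tilde{d}f}$ and $X_g = \overline{\tilde{d}g}$. Then consider the scalar

$$ \{f, g\} = \tilde\omega(X_f, X_g) = \langle \tilde{d}f, X_g \rangle. \tag{5.30} $$

Since $\tilde\omega = \tilde{d}q \otimes \tilde{d}p - \tilde{d}p \otimes \tilde{d}q$, we have

$$ X_g = \frac{\partial g}{\partial p}\frac{\partial}{\partial q} - \frac{\partial g}{\partial q}\frac{\partial}{\partial p}, \tag{5.31} $$

which can be established by verifying that $\tilde\omega(X_g) = \tilde{d}g$. Therefore we have

$$ \{f, g\} = \frac{\partial f}{\partial q}\frac{\partial g}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial g}{\partial q}. $$

This is what is usually called the Poisson bracket of the functions $f$ and $g$. The
definition (5.30) gives it a geometrical significance, and shows that the Poisson
bracket is actually independent of the coordinates. It depends only on $\tilde\omega$.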
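
The coordinate expression for the bracket is easy to experiment with. The following sketch (function names and sample functions are assumptions made for illustration) estimates {f, g} = f_q g_p - f_p g_q by central differences on a two-dimensional phase space, using a unit-frequency oscillator Hamiltonian as the example.

```python
def poisson(f, g, q, p, h=1e-5):
    """Central-difference estimate of {f, g} = f_q g_p - f_p g_q at (q, p)."""
    f_q = (f(q + h, p) - f(q - h, p)) / (2 * h)
    f_p = (f(q, p + h) - f(q, p - h)) / (2 * h)
    g_q = (g(q + h, p) - g(q - h, p)) / (2 * h)
    g_p = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return f_q * g_p - f_p * g_q

# Hypothetical sample functions on a two-dimensional phase space.
coord_q = lambda q, p: q
coord_p = lambda q, p: p
H = lambda q, p: 0.5 * p * p + 0.5 * q * q   # unit-frequency oscillator
K = lambda q, p: q * p

q0, p0 = 0.4, -1.1
print(poisson(coord_q, coord_p, q0, p0))      # close to 1
print(poisson(H, H, q0, p0))                  # vanishes by antisymmetry
print(poisson(K, H, q0, p0), p0**2 - q0**2)   # the two numbers agree
```

The last line compares the bracket of K with H against the hand-computed value p^2 - q^2, which is the rate of change of K = qp along the oscillator's motion.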


Exercise 5.6 

(a) Defining $X_H = \overline{\tilde{d}H}$, show that for any function $K$,

$$ \{K, H\} = X_H(K) = dK/dt, \tag{5.32} $$

where $t$ is the parameter such that $X_H = d/dt$. Thus, the Poisson
bracket of a function with the Hamiltonian gives the time-derivative of
that function along a solution curve. In particular, constants of the
motion have vanishing Poisson bracket with $H$.

(b) Show that the Poisson brackets satisfy the Jacobi identity

$$ \{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0 \tag{5.33} $$

for any $C^2$ functions $f, g, h$.
(c) Show from this that

$$ [X_f, X_g] = -X_{\{f, g\}}, \tag{5.34} $$

so that the Hamiltonian vector fields form a Lie algebra.


5.8 Many-particle systems: symplectic forms 
In general one deals with systems which have more than one degree of
freedom, so there is more than one $q$ and $p$. A particle in three dimensions has
3 $q$s and 3 $p$s, so phase space is 6-dimensional. A system containing $N$ such
particles has a $6N$-dimensional phase space. If we consider now a general system with
$n$ degrees of freedom, then phase space is $2n$-dimensional, and all the above
results still hold if we take the two-form $\tilde\omega$ to be

$$ \tilde\omega = \sum_{A=1}^{n} \tilde{d}q^A \wedge \tilde{d}p_A. \tag{5.35} $$

Such an $\tilde\omega$ is called a symplectic form, and then phase space is a symplectic
manifold.


Exercise 5.7
(a) Show that $f$ is a constant of the motion if $X_f = \overline{\tilde{d}f}$ is an invariant of
$H$, i.e.

$$ £_{X_f} H = 0. \tag{5.36} $$

(Refer to exercise 5.6.)
(b) Define a volume-form $\tilde\sigma$ for phase space by

$$ \tilde\sigma = \underbrace{\tilde\omega \wedge \dots \wedge \tilde\omega}_{n\ \text{times}}, \tag{5.37} $$

where $2n$ is the dimension of the space. Show that $\tilde\sigma \neq 0$ and that a
Hamiltonian vector field $U$ is divergence-free in this volume measure.
Said another way, this volume in phase space is preserved by the
time-evolution of the system. This is known as Liouville’s theorem.


Exercise 5.8

We now prove the remarks made in §3.12 about the relation between
Killing vectors and conserved quantities. For particle motion the
coordinates of phase space are $\{q^A, p_A\} = \{x^i, p_i = m v_i\}$ and the
Hamiltonian is $H = (1/2m)\,g^{ij} p_i p_j + \Phi(x^i)$. Prove that if $U$ is a Killing
vector and if $\Phi$ is constant along $U$, then its conjugate momentum,
$U^i p_i$, is a conserved quantity. Hint: using exercise 5.7, define $X_f$
as the vector field in phase space whose space components equal $U$ and
whose momentum components vanish. Show that

$$ £_{X_f} H = 0, $$

and find $f$ from equation (5.31).


5.9 Linear dynamical systems: the symplectic inner product and conserved quantities
Even more strikingly simple ways of formulating conservation laws are
possible for linear systems, by which we mean dynamical systems whose
Hamiltonian has the form

$$H = \tfrac{1}{2} \sum_{A,B=1}^{N} \left( T^{AB} p_A p_B + V_{AB}\, q^A q^B \right), \qquad (5.38)$$

where $T^{AB}$ and $V_{AB}$ are independent of the $p_A$s and $q^A$s. This system is called
linear because the equations of motion are linear in $\{q^A, p_A\}$:

$$\frac{dp_A}{dt} = -\frac{\partial H}{\partial q^A} = -V_{AB}\, q^B, \qquad (5.39)$$

$$\frac{dq^A}{dt} = \frac{\partial H}{\partial p_A} = T^{AB} p_B. \qquad (5.40)$$

Notice that we can take $T^{AB} = T^{BA}$ and $V_{AB} = V_{BA}$, since the antisymmetric
part of, say, $T^{AB}$ would make no contribution to H when contracted with the
symmetric expression $p_A p_B$.

The linearity of the system ensures that if $\{q^A_{(1)}, p_{(1)A}\}$ and $\{q^A_{(2)}, p_{(2)A}\}$ are
solutions then so is $\{\alpha q^A_{(1)} + \beta q^A_{(2)},\ \alpha p_{(1)A} + \beta p_{(2)A}\}$ for arbitrary constants $\alpha$ and
$\beta$. Thus, this phase space is not just a manifold; it has a natural vector-space
structure as well. A vector space is, of course, a kind of manifold, since it has a
map into $R^n$, but it is a manifold which can be identified with its tangent space
at every point. That is, since a curve in a vector space is a sequence of vectors,
the tangent to the curve is just the derivative of the vectors along the curve,
which is another vector, i.e. another element of the vector space. A vector space
is its own tangent space. More than this, all the tangent spaces $T_P$ have a natural
identification with each other: we are able to speak about vectors in different
$T_P$s as being equal or not, simply by whether or not their components are equal.
(This means a vector space is a flat manifold: see chapter 6.)

Since a point in phase space is a vector, we can use the symplectic form $\tilde\omega$ to
define an inner product between elements of phase space. If $Y_{(1)}$ is the vector
whose components are $\{q^A_{(1)}, p_{(1)A},\ A = 1, \ldots, N\}$ and if $Y_{(2)}$ similarly has
components $\{q^A_{(2)}, p_{(2)A}\}$, then their symplectic inner product is defined as

$$\tilde\omega(Y_{(1)}, Y_{(2)}) = \sum_{A=1}^{N} \left( q^A_{(1)}\, p_{(2)A} - q^A_{(2)}\, p_{(1)A} \right). \qquad (5.41)$$


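If $Y_{(1)}(t)$ and $Y_{(2)}(t)$ are solution curves, then their symplectic inner product is
independent of time t. To prove this, we simply substitute the equations of
motion into the expression for $d\tilde\omega(Y_{(1)}, Y_{(2)})/dt$ (sum on repeated indices here):

$$\frac{d}{dt}\,\tilde\omega(Y_{(1)}, Y_{(2)}) = \frac{d}{dt}\big(q^A_{(1)}\big)\, p_{(2)A} + \frac{d}{dt}\big(p_{(2)A}\big)\, q^A_{(1)} - \frac{d}{dt}\big(q^A_{(2)}\big)\, p_{(1)A} - \frac{d}{dt}\big(p_{(1)A}\big)\, q^A_{(2)}$$

$$= T^{AB} p_{(1)B}\, p_{(2)A} + V_{AB}\, q^B_{(1)} q^A_{(2)} - T^{AB} p_{(2)B}\, p_{(1)A} - V_{AB}\, q^B_{(2)} q^A_{(1)}.$$

From the symmetry of $T^{AB}$ and $V_{AB}$ we conclude:

$$\frac{d}{dt}\,\tilde\omega(Y_{(1)}, Y_{(2)}) = 0 \qquad (5.42)$$

if $Y_{(1)}(t)$ and $Y_{(2)}(t)$ are solutions.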
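
The conservation law (5.42) can be checked numerically. The following is only a sketch under stated assumptions: it takes the Hamiltonian in the form $H = \tfrac12(T^{AB}p_Ap_B + V_{AB}q^Aq^B)$ with symmetric coefficients, so that $dq^A/dt = T^{AB}p_B$ and $dp_A/dt = -V_{AB}q^B$, and it integrates two solutions with a classical Runge-Kutta step. The matrices `T` and `V`, the initial data, and all function names are illustrative choices, not taken from the text.

```python
# Sketch: numerical check that the symplectic inner product of two solutions
# of a linear Hamiltonian system stays constant in time.
# T, V and the initial data are illustrative assumptions.

T = [[1.0, 0.0], [0.0, 0.5]]      # symmetric T^{AB}
V = [[2.0, -1.0], [-1.0, 2.0]]    # symmetric V_{AB}


def matvec(M, x):
    """Matrix-vector product for small dense lists."""
    return [sum(M[a][b] * x[b] for b in range(len(x))) for a in range(len(M))]


def deriv(y):
    """Right-hand side for y = [q^1..q^N, p_1..p_N]: dq/dt = T p, dp/dt = -V q."""
    n = len(y) // 2
    q, p = y[:n], y[n:]
    return matvec(T, p) + [-f for f in matvec(V, q)]


def shifted(y, k, h):
    """Return y + h*k componentwise."""
    return [yi + h * ki for yi, ki in zip(y, k)]


def rk4_step(y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(y)
    k2 = deriv(shifted(y, k1, dt / 2))
    k3 = deriv(shifted(y, k2, dt / 2))
    k4 = deriv(shifted(y, k3, dt))
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def evolve(y, dt, steps):
    """Integrate the equations of motion for `steps` steps of size dt."""
    for _ in range(steps):
        y = rk4_step(y, dt)
    return y


def symplectic_product(y1, y2):
    """omega(Y1, Y2) = sum_A (q1^A p2_A - q2^A p1_A)."""
    n = len(y1) // 2
    return sum(y1[a] * y2[n + a] - y2[a] * y1[n + a] for a in range(n))


y1 = [1.0, 0.0, 0.0, 0.5]
y2 = [0.0, -1.0, 0.3, 0.0]
before = symplectic_product(y1, y2)
after = symplectic_product(evolve(y1, 1e-3, 2000), evolve(y2, 1e-3, 2000))
print(before, after)
```

The two printed values should agree up to the integrator's truncation and rounding error; the product of a solution with itself is zero by antisymmetry, which is why a second, related solution is needed to build conserved quantities.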

The symplectic inner product enables us to define in an elegant way certain
conserved quantities associated with solutions. At first sight this may not be
obvious: although the symplectic inner product is conserved, the symplectic
inner product of a solution with itself vanishes identically. The trick is to use
an invariance of the system (i.e. of $T^{AB}$ and $V_{AB}$) to generate from one solution
Y another closely related one. For example, suppose $T^{AB}$ and $V_{AB}$ are
independent of time. Then the equations of motion tell us that if Y(t) is a solution, so
is dY/dt. We define the canonical energy $E_c$ of the solution Y to be

$$E_c(Y) = \tfrac{1}{2}\,\tilde\omega\!\left( \frac{dY}{dt},\ Y \right). \qquad (5.43)$$

It is easy to verify that $E_c(Y)$ is just the value of the Hamiltonian on the
solution Y.

Other conserved quantities are just as easy to derive. It usually happens that
$T^{AB}$ and $V_{AB}$ depend on the coordinates $\{x^i\}$ of the manifold in which the
dynamical system is defined (Euclidean space for nonrelativistic dynamics). If,
as in exercise 5.8, there is some vector field U for which

$$\pounds_U T^{AB} = 0 = \pounds_U V_{AB}, \qquad (5.44)$$

then there is a conserved quantity associated with U. (In computing $\pounds_U T^{AB}$ it is
important to distinguish between indices A, B which refer to coordinates in
phase space and the tensorial character of $T^{AB}$ on the original manifold. The
quantities $T^{AB}$ may be scalars, or tensors on the original manifold, depending
upon whether the quantities $q^A$ are scalars or tensors of higher order. The
indices A and B are labels; they do not imply that $T^{AB}$ should be treated as a
tensor of type $\binom{2}{0}$ when computing the Lie derivative with respect to U, because
U is a vector field in the original manifold, not in phase space.) As before, if Y is
a solution, then so is $\pounds_U Y$. (Again the same remark applies: this is a derivative in
the original manifold, not in phase space.) We therefore define the (conserved)
canonical U-momentum

$$P_U(Y) = -\tfrac{1}{2}\,\tilde\omega(\pounds_U Y,\ Y). \qquad (5.45)$$

The reader is invited to try a simple example, such as the one given in exercise
5.8, to verify that the usual conserved quantity does indeed appear.

Although our discussion has been confined to systems with a finite number
(N) of degrees of freedom, the formalism generalizes in a straightforward way to
continuous systems, such as wave equations. Readers familiar with the
Klein-Gordon equation may recognize the symplectic inner product: the integral of
the conserved Klein-Gordon current density $\psi^*\dot\psi - \psi\dot\psi^*$ is just (to within
constant factors) $\tilde\omega(\psi^*, \psi)$. A discussion of the canonical conserved quantities for
waves in fluids, with application to questions of stability, can be found in
Friedman & Schutz (1978) (see bibliography).
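
The claim that the canonical energy equals the Hamiltonian can be spelled out in one line. The sketch below assumes the conventions $H = \tfrac12(T^{AB}p_Ap_B + V_{AB}q^Aq^B)$ with symmetric coefficients, $E_c(Y) = \tfrac12\tilde\omega(dY/dt, Y)$, and the inner product (5.41) with $Y_{(1)} = dY/dt$ and $Y_{(2)} = Y$; if the normalizations differ, the result changes only by a constant factor.

```latex
\begin{align*}
\tilde\omega\!\left(\frac{dY}{dt},\, Y\right)
  &= \frac{dq^A}{dt}\, p_A - q^A\, \frac{dp_A}{dt}
     && \text{by (5.41)} \\
  &= T^{AB} p_B\, p_A + q^A\, V_{AB}\, q^B
     && \text{by the equations of motion} \\
  &= 2H ,
\end{align*}
% hence E_c(Y) = (1/2) \tilde\omega(dY/dt, Y) = H evaluated on the solution Y.
```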


5.10 Fiber bundle structure of the Hamiltonian equations 

Our original statement in §5.4, that we defined phase space to be the
manifold whose coordinates are p and q, hid a lot of interesting and important
structure. Suppose a dynamical system has the N coordinates $\{q^i\}$ corresponding
to its N degrees of freedom. These define a manifold called configuration space
M, and the evolution of the dynamical system in time is described by a curve
$q^i(t)$ in M. The Lagrangian $\mathcal{L}$ is a function of $q^i$ and $q^i{}_{,t} = dq^i/dt$, and so is a function
on TM, the tangent bundle of M. We now show that the momentum

$$p_i = \partial\mathcal{L} / \partial(q^i{}_{,t}) \qquad (5.46)$$

is a one-form field on M, a cross-section of the cotangent bundle T*M. We show
this by its transformation properties. Let us define new coordinates for M

$$Q^{i'} = Q^{i'}(q^j). \qquad (5.47)$$

Then the new momenta are

$$P_{i'} = \frac{\partial\mathcal{L}}{\partial Q^{i'}{}_{,t}} = \frac{\partial\mathcal{L}}{\partial q^j{}_{,t}}\, \frac{\partial q^j{}_{,t}}{\partial Q^{i'}{}_{,t}}. \qquad (5.48)$$

Now, both $q^j{}_{,t}$ and $Q^{i'}{}_{,t}$ are elements of the fiber over any point P, and
coordinates on this fiber undergo a natural change induced by (5.47). That is, if V is
any vector at P its components change by

$$V^{i'} = \Lambda^{i'}{}_j V^j, \qquad V^k = \Lambda^k{}_{i'} V^{i'},$$

where $\Lambda^{i'}{}_j = \partial Q^{i'}/\partial q^j$ and $\Lambda^k{}_{i'}$ is its inverse. This applies as well to the
velocity vector $q^j{}_{,t}$:

$$q^j{}_{,t} = \Lambda^j{}_{i'}\, Q^{i'}{}_{,t} \quad\Rightarrow\quad \frac{\partial q^j{}_{,t}}{\partial Q^{i'}{}_{,t}} = \Lambda^j{}_{i'}.$$

Using this in (5.48) gives

$$P_{i'} = \Lambda^j{}_{i'}\, p_j, \qquad (5.49)$$

so that the momentum is indeed a one-form.
It follows that phase space, whose coordinates are $\{q^i, p_i\}$, is nothing but the
cotangent bundle T*M, and the Hamiltonian is a function on this bundle. What
is more, the symplectic form,

$$\tilde\omega = dq^i \wedge dp_i,$$

(summation convention employed) is independent of the coordinates in M. The
transformation for it is

$$Q^{i'} = Q^{i'}(q^j) \ \Rightarrow\ dQ^{i'} = \Lambda^{i'}{}_j\, dq^j,$$
$$P_{i'} = \Lambda^k{}_{i'}\, p_k \ \Rightarrow\ dP_{i'} = \Lambda^k{}_{i',l}\, p_k\, dq^l + \Lambda^k{}_{i'}\, dp_k. \qquad (5.50)$$

(Remember that this d operator acts in T*M, not in M, and that the functions
$\Lambda^k{}_{i'}$ are functions only of the coordinates of M.) Then we find

$$dQ^{i'} \wedge dP_{i'} = \Lambda^{i'}{}_j \Lambda^k{}_{i',l}\, p_k\, dq^j \wedge dq^l + \Lambda^{i'}{}_j \Lambda^k{}_{i'}\, dq^j \wedge dp_k. \qquad (5.51)$$

Now we also have

$$\Lambda^{i'}{}_j \Lambda^k{}_{i'} = \delta^k{}_j \quad\Rightarrow\quad \Lambda^{i'}{}_j \Lambda^k{}_{i',l} = -\Lambda^{i'}{}_{j,l}\, \Lambda^k{}_{i'}.$$

So (5.51) becomes

$$dQ^{i'} \wedge dP_{i'} = -\Lambda^{i'}{}_{j,l}\, \Lambda^k{}_{i'}\, p_k\, dq^j \wedge dq^l + dq^j \wedge dp_j.$$

The first term on the right-hand side vanishes because

$$\Lambda^{i'}{}_{j,l} = \frac{\partial^2 Q^{i'}}{\partial q^j\, \partial q^l}$$

is symmetric in j and l and is contracted with the antisymmetric form $dq^j \wedge dq^l$.
Therefore $\tilde\omega$ is independent of the coordinates of M and is a natural structure on
the cotangent bundle T*M. Moreover, T*M is always orientable, since the
volume-form $\tilde\sigma$ defined in exercise 5.7(b) is nowhere zero.

Clearly, although our examples treated the fiber structure as trivial (i.e. as a 
product of the q-space and p-space), it is possible to have nontrivial manifolds M 
and fiber bundles T*M, in which all the coordinate-dependent formulae above
are valid only in local coordinate patches. Even an example as simple as that of a 
bead constrained to move on the surface of a sphere has a nontrivial bundle 
structure for phase space, as we pointed out in §2.11. 
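
The coordinate independence of the symplectic form can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the text: it takes M to be the plane with Cartesian coordinates (x, y), chooses polar coordinates (r, θ) as the new coordinates, transforms the momenta as components of a one-form, and checks that the Jacobian J of the induced map on phase space satisfies $J^T \Omega J = \Omega$, where $\Omega$ is the component matrix of $dq^i \wedge dp_i$. All names are illustrative.

```python
import math

# Sketch: the map on phase space induced by a change of coordinates on M
# preserves dq^i ^ dp_i. Example (an assumption): Cartesian -> polar on the plane.


def to_polar_phase_space(z):
    """(x, y, p_x, p_y) -> (r, theta, P_r, P_theta), momenta transformed as a one-form."""
    x, y, px, py = z
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    p_r = (x * px + y * py) / r      # P_r = (dx/dr) p_x + (dy/dr) p_y
    p_theta = x * py - y * px        # P_theta = (dx/dtheta) p_x + (dy/dtheta) p_y
    return [r, theta, p_r, p_theta]


def jacobian(f, z, h=1e-6):
    """Central-difference Jacobian J[i][j] = d f_i / d z_j."""
    n = len(z)
    cols = []
    for j in range(n):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(f(zp), f(zm))])
    return [[cols[j][i] for j in range(n)] for i in range(n)]


# Components of dq^1 ^ dp_1 + dq^2 ^ dp_2 in the ordering (q^1, q^2, p_1, p_2).
OMEGA = [[0, 0, 1, 0],
         [0, 0, 0, 1],
         [-1, 0, 0, 0],
         [0, -1, 0, 0]]


def pulled_back_form(J):
    """Components of the new-coordinate form expressed in the old coordinates: J^T OMEGA J."""
    n = len(J)
    return [[sum(J[a][i] * OMEGA[a][b] * J[b][j]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]


point = [1.2, 0.7, -0.4, 0.9]
result = pulled_back_form(jacobian(to_polar_phase_space, point))
max_error = max(abs(result[i][j] - OMEGA[i][j]) for i in range(4) for j in range(4))
print(max_error)
```

The printed deviation should be at the level of the finite-difference error, reflecting that $dQ^{i'} \wedge dP_{i'}$ and $dq^i \wedge dp_i$ are the same two-form.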


C Electromagnetism 


5.11 Rewriting Maxwell’s equations using differential forms 
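Maxwell’s equations, written in conventional form but with units where
$c = \mu_0 = \epsilon_0 = 1$, are

$$\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = 4\pi \mathbf{J}, \qquad (5.52a)$$

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0, \qquad (5.52b)$$

$$\nabla \cdot \mathbf{B} = 0, \qquad (5.52c)$$

$$\nabla \cdot \mathbf{E} = 4\pi\rho. \qquad (5.52d)$$

In writing these equations we have, of course, used the curl and divergence
operations of ordinary flat three-space.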

What we shall show below is that there exists a way of writing these equations
using only the concepts of the metric and the exterior derivative. First we rewrite
the equations in their relativistically invariant form† by first defining the
Faraday two-form F, whose components are

$$(F_{\mu\nu}) = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix}. \qquad (5.53)$$

(Here, as in §2.31, Greek indices run over t, x, y, z.)


Exercise 5.9
Prove that under a spatial rotation $F_{\mu\nu}$ transforms in such a way that
both E and B transform as three-vectors.


In terms of the Faraday tensor, Maxwell’s equations take a particularly simple
form. For instance, the four equations (5.52b, c) are just

$$F_{[\mu\nu,\gamma]} = 0 \iff dF = 0, \qquad (5.54)$$


where we have used the square-bracket notation to denote antisymmetrization. 


Exercise 5.10 
(a) Prove that (5.54) constitutes four linearly independent equations. 
(b) Evaluate (5.54) for the components of F given by (5.53) and prove
their equality to (5.52b, c).


As for the rest of the equations, if we introduce the special-relativistic metric
whose components in this coordinate system are

$$(g_{\mu\nu}) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (5.55)$$

then we can define an antisymmetric $\binom{2}{0}$ tensor F whose components are
$F^{\mu\nu} = g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta}$,

$$(F^{\mu\nu}) = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix}. \qquad (5.56)$$

† For readers to whom this is unfamiliar, recall that Maxwell’s equations are the
correct theory for light and that special relativity was invented to explain certain
properties of light, so the theory is already relativistically correct. All we do here
is to find a convenient form for the equations.


Exercise 5.11 
Prove equation (5.56). 
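
As a quick sanity check on the index gymnastics (not a substitute for the proof asked for in the exercise), the sketch below raises both indices of $F_{\mu\nu}$ with the inverse of the metric diag(−1, 1, 1, 1), which equals the metric itself. The component layout follows (5.53) with index order (t, x, y, z); the sample field values and function names are illustrative assumptions.

```python
# Sketch: raise both indices of the Faraday two-form with the Minkowski metric.
# Index order is (t, x, y, z); sample field values are arbitrary.

ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]   # the inverse metric has the same components


def faraday_lower(E, B):
    """Components F_{mu nu} as laid out in (5.53)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, Bz, -By],
            [Ey, -Bz, 0, Bx],
            [Ez, By, -Bx, 0]]


def raise_indices(F):
    """F^{mu nu} = g^{mu alpha} g^{nu beta} F_{alpha beta}."""
    return [[sum(ETA[m][a] * ETA[n][b] * F[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


F_up = raise_indices(faraday_lower((1, 2, 3), (4, 5, 6)))
for row in F_up:
    print(row)
```

Only the time-space components change sign, so the electric field appears with opposite sign relative to (5.53) while the magnetic block is unchanged.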


Then the remaining equations are

$$F^{\mu\nu}{}_{,\nu} = 4\pi J^\mu, \qquad (5.57)$$

where we have defined the current four-vector to have components $\{J^t = \rho$,
$J^i = (\mathbf{J})^i$ for $i = x, y, z\}$.


Exercise 5.12 
Prove that the four equations (5.57) are just the same as (5.52a, d).


So far we have stuck to Lorentz coordinates because, while (5.54) is
coordinate-independent, (5.57) is not a valid tensor equation in every coordinate
system (recall exercise 4.15). On the other hand, we saw in exercise 4.23 how to
define the divergence of an antisymmetric $\binom{2}{0}$ tensor (two-vector) if we have a
volume-form. Because we have a metric, and because $\{\partial/\partial t, \partial/\partial x, \partial/\partial y, \partial/\partial z\}$
form an orthonormal basis in this metric, the preferred volume-form is

$$\tilde\omega = dt \wedge dx \wedge dy \wedge dz.$$


The following exercise develops the argument. 


Exercise 5.13
(a) Define the two-form *F to be the contraction

$${}^*F = \tfrac{1}{2}\,\tilde\omega(F), \qquad (5.58)$$

i.e.

$$({}^*F)_{\mu\nu} = \tfrac{1}{2}\,\omega_{\alpha\beta\mu\nu} F^{\alpha\beta}.$$

This is, of course, the dual of F introduced in chapter 4. Find the
components $({}^*F)_{\mu\nu}$ in terms of E and B.
(b) Define the three-form *J by the contraction

$${}^*J = \tilde\omega(J), \qquad (5.59)$$

and show that (5.57) is equivalent to

$$d({}^*F) = 4\pi\, {}^*J. \qquad (5.60)$$

By exercise 4.23 this is also

$$\operatorname{div}_{\tilde\omega} F = 4\pi J. \qquad (5.61)$$


Note the great formal similarity between the two halves of our new form for
Maxwell’s equations:

$$dF = 0, \qquad (5.54)$$
$$d\,{}^*F = 4\pi\, {}^*J. \qquad (5.60)$$

Note also that they now are completely coordinate-free, so they have this form
in any manifold with metric (because the metric was needed to obtain *F from
F). The similarity between (5.54) and (5.60) is deep in Maxwell’s equations.
Note that the * operation on F simply results in an exchange of E and B (cf.
exercise 5.13(a)), and recall also that J was the electrical current density. If there
were magnetic monopoles we would have two current densities, $J_e$ and $J_m$, and
Maxwell’s equations would take the symmetric form

$$dF = 4\pi\, {}^*J_m, \qquad d\,{}^*F = 4\pi\, {}^*J_e. \qquad (5.62)$$


Exercise 5.14 

(a) Prove (5.62). 

(b) Prove by exterior differentiation that equation (5.60) guarantees con- 
servation of charge, i.e. that 


$$\operatorname{div}_{\tilde\omega}(J) = 0. \qquad (5.63)$$


Exercise 5.15
Establish the integral theorem for charge in the following way.

(a) Choose any oriented three-dimensional hypersurface $\mathcal{H}$ and restrict
(5.60) to it. Prove that restriction commutes with exterior
differentiation, i.e. that

$$d\left[({}^*F)|_{\mathcal{H}}\right] = (d\,{}^*F)|_{\mathcal{H}}.$$

(b) Choose a region $\mathcal{D}$ of $\mathcal{H}$, with boundary $\partial\mathcal{D}$. Integrate the restriction
of (5.60) over $\mathcal{D}$ and apply Stokes’ theorem to find (appropriate
restrictions implied)

$$\int_{\mathcal{D}} {}^*J = \frac{1}{4\pi} \oint_{\partial\mathcal{D}} {}^*F.$$

(c) In the case where $\mathcal{H}$ is a hypersurface t = const in Minkowski
spacetime and $\partial\mathcal{D}$ is a sphere, show that this gives the total charge in $\mathcal{D}$ as an
integral of the normal component of the electric field over $\partial\mathcal{D}$.


5.12 Charge and topology 

Since we can now formulate Maxwell’s equations on any manifold with 
a metric, we can mention two attempts which have been made to resolve the 
puzzling question ‘what is charge?’ by answering ‘charge is topology’. The first 
explanation, due to J. A. Wheeler (1962), is extremely simple. Consider figure 
5.2, in which a hypersurface t = const of some hypothetical spacetime is 
depicted. The lines drawn are integral curves of E. There is no charge density 
anywhere, and these integral curves are either closed (threading through the 
handle, out one hole, and down the other) or infinite (though they pass through 
the handle). Consider what an experimenter who measures E on the sphere S
surrounding one hole will deduce: the integral $\oint_S {}^*F$ will certainly not vanish (E is
outward-pointing all over S), and he will say the hole has positive charge.
Likewise, a sphere around the other hole would give it negative charge, of exactly the
same magnitude. (The calculation of exercise 5.15 fails because S does not divide
the manifold into an inside and outside, cf. figure 4.10.) So this is a model for 
‘charge without charge’, which has the bonus of explaining why negative charges 
equal positive charges. It has two drawbacks: first, no-one pretends to have a 
solution to, say, Einstein’s equations which gives a geometry for spacetime that 
looks like this; and second, it is perhaps philosophically displeasing to think of
two charges, which may be separated by huge distances, linked together by their
own special ‘handle’.

Fig. 5.2. A ‘wormhole’ or handle attached to a three-dimensional manifold
with one dimension suppressed. Lines of force can thread through
the handle, come out, and go back down again to give each ‘mouth’ the
appearance of charge in a charge-free space.

The second explanation is more sophisticated, using a manifold made non- 
orientable by a special construction of the handle. This is due to Sorkin (1977) 
(reference in the bibliography of chapter 4). In this model, both holes have the 
same charge and so may be assumed to be close together, forming what to an out- 
side observer looks like a single charge of twice the strength of each hole. Here 
the breakdown in exercise 5.15 occurs because the manifold is nonorientable. 
This model overcomes the second objection to Wheeler’s picture, but not the
first. And neither model explains why two unrelated charges should be equal. 
Nevertheless they illustrate a maxim which is becoming more convincing all the 
time: there is more to theoretical physics than just its local differential equations! 


5.13 The vector potential 

The existence of a ‘vector potential’ for Maxwell’s equations follows 
naturally from (5.54). Since F is a closed two-form, there is a one-form A such 
that 
$$F = dA \qquad (5.64)$$
in some neighborhood of any point. This one-form can be mapped into a vector
by the metric, and this is called the vector potential. A more natural concept is,
of course, the one-form potential. Note that A is not uniquely defined: $A' = A
+ df$, for an arbitrary function f, also gives F in (5.64). This is a gauge
transformation. Note also that if magnetic monopoles exist, then dF does not vanish
everywhere. By our discussion of exact forms in chapter 4, it will be possible to 
define A only in simple regions which contain no magnetic monopoles. In par- 
ticular, in a region of spacetime containing the world-line of a magnetic 
monopole, the one-form potential cannot be consistently defined everywhere. 


Exercise 5.16

(a) Show that, if a one-form potential A exists, then in nonrelativistic
language it is related to the scalar potential $\phi$ and the vector potential
$A^i$ by $\phi = A_t$, $A^i$ (vector potential) $= -A_i$ (one-form), where indices
refer to the coordinates of (5.52).

(b) Show how $\phi$ and $A^i$ defined in (a) change under a gauge transformation.

(c) To illustrate the problems caused to the one-form potential A by
magnetic monopoles, consider a situation with charges and no monopoles,
but in which one defines a one-form potential $\alpha$ for *F by the equation

$${}^*F = d\alpha.$$

(By the duality between electric and magnetic fields under the
*-operation, $\alpha$ should have the same problems with electric charge as A
has with magnetic.) Write down Maxwell’s equations in terms of $\alpha$ and
show that $\alpha$ exists in regions that contain no charge and that can
be shrunk to zero. Show this by finding an explicit solution for $\alpha$ in
the case of a single isolated static charge q.


5.14 Plane waves: a simple example 

Plane electromagnetic waves, as is well-known, travel at the speed of 
light. Consider a particular Faraday tensor F, all of whose components are
functions only of $u = t - x$ (recall that we are using units in which c = 1):

$$F_{\mu\nu} = A_{\mu\nu}(t - x) = A_{\mu\nu}(u). \qquad (5.65)$$

What are the conditions that this satisfy the empty-space equations $dF = 0$,
$d\,{}^*F = 0$? From (5.65) we have

$$dF = d\left(\tfrac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu\right) = \tfrac{1}{2}\, d(F_{\mu\nu}) \wedge dx^\mu \wedge dx^\nu = \tfrac{1}{2}\,(dA_{\mu\nu}/du)\, du \wedge dx^\mu \wedge dx^\nu.$$

From (5.53) it is easy to deduce

$$dF = \frac{d}{du}(B_z - E_y)\, dt \wedge dx \wedge dy + \frac{d}{du}(B_x)\, dt \wedge dy \wedge dz + \frac{d}{du}(-B_x)\, dx \wedge dy \wedge dz + \frac{d}{du}(-B_y - E_z)\, dt \wedge dx \wedge dz,$$

the vanishing of which implies (ignoring any static fields)

$$B_z = E_y, \qquad B_y = -E_z, \qquad B_x = 0. \qquad (5.66)$$


Exercise 5.17
Show that the equation $d\,{}^*F = 0$ implies

$$B_z = E_y, \qquad B_y = -E_z, \qquad E_x = 0. \qquad (5.67)$$


By this exercise we see that a plane electromagnetic wave has transverse electric
and magnetic fields (i.e. perpendicular to its direction of propagation), and that
these are determined by two independent functions, $E_y(u)$ and $E_z(u)$,
corresponding to the two independent polarizations of the wave.
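
The conditions (5.66) and (5.67) can be checked against the conventional form (5.52) of Maxwell's equations in empty space. The sketch below picks two arbitrary smooth profiles for $E_y(u)$ and $E_z(u)$ (assumptions made purely for illustration), builds B from them, and evaluates the residuals of the vacuum equations by central differences. Because the fields depend only on t and x, the y- and z-derivatives are dropped.

```python
import math

# Sketch: a plane wave with B_z = E_y, B_y = -E_z, B_x = E_x = 0 and
# u = t - x satisfies the vacuum Maxwell equations. Profiles are assumptions.


def profile_y(u):
    """Assumed polarization profile E_y(u)."""
    return math.sin(u)


def profile_z(u):
    """Assumed polarization profile E_z(u)."""
    return math.exp(-u * u)


def fields(t, x):
    """Return (E, B) as 3-tuples at time t and position x."""
    u = t - x
    E = (0.0, profile_y(u), profile_z(u))
    B = (0.0, -profile_z(u), profile_y(u))
    return E, B


def maxwell_residuals(t, x, h=1e-5):
    """Residuals of curl E + dB/dt, curl B - dE/dt, div E, div B (all should vanish)."""
    def d_dt(which, i):
        return (fields(t + h, x)[which][i] - fields(t - h, x)[which][i]) / (2 * h)

    def d_dx(which, i):
        return (fields(t, x + h)[which][i] - fields(t, x - h)[which][i]) / (2 * h)

    E, B = 0, 1
    faraday = [d_dt(B, 0),
               -d_dx(E, 2) + d_dt(B, 1),
               d_dx(E, 1) + d_dt(B, 2)]
    ampere = [-d_dt(E, 0),
              -d_dx(B, 2) - d_dt(E, 1),
              d_dx(B, 1) - d_dt(E, 2)]
    gauss = [d_dx(E, 0), d_dx(B, 0)]
    return faraday + ampere + gauss


print(max(abs(r) for r in maxwell_residuals(0.4, -1.3)))
```

The largest residual should be at the level of floating-point noise, for any choice of the two profile functions.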


D Dynamics of a perfect fluid 


5.15 Role of Lie derivatives 
By a ‘perfect’ fluid we mean one which has no viscosity and moves
adiabatically, i.e. with no heat conduction. It is well-known that such a fluid
obeys certain local conservation laws: during its motion any fluid element has a 
constant mass, entropy, and — in some sense — vorticity. These conservation 
laws are usually derived using ordinary vector calculus, and can seem rather 
complicated. From the geometric point of view, the existence of a flow suggests 
immediately the use of the Lie derivative, and we now show that the local con- 
servation laws become much more transparent when framed with Lie derivatives. 


5.16 The comoving time-derivative 
We have seen in exercise 4.22 that the equation of continuity, whose
conventional form is

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho V) = 0,$$

takes the form

$$\left(\frac{\partial}{\partial t} + \pounds_V\right)(\rho\,\tilde\omega) = 0, \qquad (5.68)$$

where $\tilde\omega = dx \wedge dy \wedge dz$ is the volume three-form of Euclidean space. The
operator $(\partial/\partial t + \pounds_V)$ is a natural time-derivative operator following a particular
fluid element. To see this, think not of space but of the four-dimensional
manifold called Galilean spacetime, whose coordinates are (x, y, z, t) (see §2.10).
Any hypersurface t = const is in fact Euclidean space. Then the motion of a fluid
element describes a curve on spacetime, called the world-line of the element. In
figure 5.3, two such world-lines (AA' and BB') are drawn. For an infinitesimal
change in time dt, a point on this curve moves from the point with coordinates
(x, y, z, t) to the one with coordinates $(x + V^x dt,\ y + V^y dt,\ z + V^z dt,\ t + dt)$.
If we call U the tangent to the world-line in the four-dimensional manifold, then
it clearly has components $(V^x, V^y, V^z, 1)$. The time-derivative following a fluid
element is simply $\pounds_U$, the natural derivative along the world-line of the element.


Fig. 5.3. Two moments of Galilean time and the world-lines AA' and
BB' of two particles. The vector U is the tangent to AA' parameterized
by time t.

Exercise 5.18
Using equation (2.7) show that

$$\pounds_U W = \left(\frac{\partial}{\partial t} + \pounds_V\right) W, \qquad (5.69)$$

where W is any vector field in the hypersurface t = const, i.e. any
purely spatial vector field ($W^t = 0$).


Equation (5.69) clearly holds if W is replaced by any $\binom{N}{0}$ tensor which is entirely
in the three-space t = const. It might seem that the notion of a tensor being
purely spatial is not invariant under coordinate changes in the four-dimensional 
manifold, since it simply says that all the t-components of the tensor vanish. 
This is acceptable here, however, because of the rigid distinction made in non- 
relativistic physics between space and time. 


Exercise 5.19
The most general kind of coordinate transformation which remains
‘natural’ to the fiber-bundle structure of Galilean spacetime (§2.10) is

$$t' = f(t); \qquad x^{i'} = f^{i'}(x^j, t), \quad i', j = 1, 2, 3. \qquad (5.70)$$

Show that under this transformation a $\binom{N}{0}$ tensor A with no
time-components ($A(\ldots, \tilde d t, \ldots) = 0$) remains one with no
time-components, and a $\binom{0}{2}$ tensor B with no spatial components (i.e. only
$B_{tt}$ is nonzero) remains one with no spatial components.


5.17 Equation of motion 

The condition that the flow be adiabatic means that the total entropy 
of a fluid element must be conserved. It is convenient to work with S, the 
specific entropy (entropy per unit mass). This must clearly be constant during 
the flow: 


$$\left(\frac{\partial}{\partial t} + \pounds_V\right) S = 0. \qquad (5.71)$$


The Euler equation of motion for a fluid whose pressure is p and which
moves in a gravitational field whose potential is $\Phi$ can be written in Cartesian
coordinates as

$$\frac{\partial V^i}{\partial t} + V^j \frac{\partial V^i}{\partial x^j} + \frac{1}{\rho}\frac{\partial p}{\partial x^i} + \frac{\partial \Phi}{\partial x^i} = 0. \qquad (5.72)$$

There are two reasons that this equation is valid only in Cartesian coordinates:
first, some indices i are up and some are down, and only in an orthonormal basis
does this make no difference; second, the term $\partial V^i/\partial x^j$ transforms like a $\binom{1}{1}$
tensor only if the transformation matrix $\Lambda^{i'}{}_j$ is independent of position (exercise
4.5), which is true for a transformation from one Cartesian frame to another.
The usual way to adapt it to arbitrary coordinates is to introduce the covariant
derivative, which is defined in the chapter on Riemannian geometry. Here we
show that there is a different, and very instructive, approach. First, note that the
first two terms of (5.72) can be written as

$$\frac{\partial V_i}{\partial t} + V^j \frac{\partial V_i}{\partial x^j},$$

since there is no difference between $V^i$ and $V_i$ in Cartesian coordinates. (We use
here, of course, the fact that the three-dimensional space has a metric tensor.)
Next, replace the derivative $V^j\, \partial/\partial x^j$ with the Lie derivative (equation (3.14)) of
the one-form $\tilde V = \mathbf{g}(V,\ )$:

$$(\pounds_V \tilde V)_i = V^j \frac{\partial V_i}{\partial x^j} + V_j \frac{\partial V^j}{\partial x^i} = V^j \frac{\partial V_i}{\partial x^j} + \frac{1}{2}\frac{\partial}{\partial x^i}(V^j V_j),$$

where in obtaining the final expression we again used the fact that $V_i = V^i$.
Therefore we find

$$V^j \frac{\partial V_i}{\partial x^j} = (\pounds_V \tilde V)_i - \frac{1}{2}\frac{\partial}{\partial x^i}(V^2). \qquad (5.73)$$

Both terms on the right-hand side are tensors in any coordinate system!
Therefore (5.72) becomes the frame-independent expression

$$\left(\frac{\partial}{\partial t} + \pounds_V\right)\tilde V + \frac{1}{\rho}\,\tilde d p + \tilde d\left(\Phi - \tfrac{1}{2} V^2\right) = 0. \qquad (5.74)$$

In this the role of the metric is crucial but hidden: it is required to form $\tilde V$ from
V, and hence to form $V^2 = \tilde V(V)$.
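
The Cartesian identity behind (5.73), namely that the advective term $V^j\,\partial V_i/\partial x^j$ equals the Lie derivative of the velocity one-form minus half the gradient of $V^2$, can be verified numerically for a sample flow. This is a sketch only: the velocity field, the evaluation point, and the helper names are arbitrary assumptions, derivatives are taken by central differences, and Cartesian coordinates are assumed so that $V_i = V^i$.

```python
import math

# Sketch: check V^j d_j V_i = (Lie_V V~)_i - (1/2) d_i (V^2) in Cartesian
# coordinates for an arbitrary (assumed) velocity field.


def velocity(x):
    """An arbitrary smooth velocity field V^i(x, y, z)."""
    return [math.sin(x[1]), x[0] * x[2], math.cos(x[0]) + x[1] ** 2]


def partial(f, x, j, h=1e-5):
    """Central-difference derivative of the list-valued function f along x^j."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(xp), f(xm))]


def advective_term(x):
    """V^j dV_i/dx^j."""
    v = velocity(x)
    dV = [partial(velocity, x, j) for j in range(3)]   # dV[j][i] = dV_i/dx^j
    return [sum(v[j] * dV[j][i] for j in range(3)) for i in range(3)]


def lie_derivative_one_form(x):
    """(Lie_V V~)_i = V^j dV_i/dx^j + V_j dV^j/dx^i."""
    v = velocity(x)
    dV = [partial(velocity, x, j) for j in range(3)]
    return [sum(v[j] * dV[j][i] + v[j] * dV[i][j] for j in range(3))
            for i in range(3)]


def half_grad_speed_squared(x):
    """(1/2) d(V^2)/dx^i, differentiating V^2 directly."""
    def half_v2(y):
        return [0.5 * sum(c * c for c in velocity(y))]
    return [partial(half_v2, x, i)[0] for i in range(3)]


point = [0.3, -0.7, 1.1]
lhs = advective_term(point)
rhs = [a - b for a, b in zip(lie_derivative_one_form(point),
                             half_grad_speed_squared(point))]
print(lhs)
print(rhs)
```

The two printed lists should agree to within the finite-difference error. The point of (5.73) is that the right-hand side keeps its form in any coordinate system, whereas the left-hand side does not.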


5.18 Conservation of vorticity 

Now we are in a position to consider conservation of vorticity. In
conventional terms, the vorticity is the curl of the velocity, $\nabla \times V$. As we saw in
chapter 4, this is properly the exterior derivative $\tilde d\tilde V$. Now, exterior
differentiation and Lie differentiation commute (and of course $\tilde d$ and $\partial/\partial t$ commute
since $\tilde d$ only involves spatial derivatives), so we find from (5.74)

$$\left(\frac{\partial}{\partial t} + \pounds_V\right) dV = \frac{1}{\rho^2}\, d\rho \wedge dp. \qquad (5.75)$$

(We have dropped tildes over symbols for clarity.) There are two cases to be
considered. The easier is when the fluid obeys an equation of state $p = p(\rho)$. Then
$d\rho \wedge dp = 0$ and we find that the vorticity two-form dV obeys the local (or
convective) conservation law

$$\left(\frac{\partial}{\partial t} + \pounds_V\right) dV = 0. \qquad (5.76)$$

This is the Helmholtz circulation theorem, written in its most natural form. A
different result holds, however, if the more general equation of state $p = p(\rho, S)$
obtains. Then the right-hand side of (5.75) does not vanish, but its wedge
product with dS does:

$$dS \wedge d\rho \wedge dp = 0. \qquad (5.77)$$


Exercise 5.20 
Prove (5.77). 


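The exterior derivative of (5.71) gives

$$\left(\frac{\partial}{\partial t} + \pounds_V\right) dS = 0. \qquad (5.78)$$

Therefore we can wedge dS with (5.75) to get

$$dS \wedge \left(\frac{\partial}{\partial t} + \pounds_V\right) dV = 0,$$

or

$$\left(\frac{\partial}{\partial t} + \pounds_V\right)(dS \wedge dV) = 0. \qquad (5.79)$$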


This equation is the most general vorticity conservation law. It is called Ertel’s 
theorem. 

The meaning of the three-form $dS \wedge dV$ may not be immediately apparent,
but it is possible to convert (5.79) into a conservation law for a scalar. The
reason is that there is another conserved three-form, $\rho\omega$, and any two
three-forms in a three-dimensional space are proportional. Therefore there is a scalar
function $\alpha$ such that

$$dS \wedge dV = \alpha\rho\omega, \qquad (5.80)$$

and (5.68) and (5.79) then give the scalar equation

$$\left(\frac{\partial}{\partial t} + \pounds_V\right)\alpha = 0.$$

It can be shown that, in conventional vector notation,

$$\alpha = \frac{1}{\rho}\, \nabla S \cdot \nabla \times V. \qquad (5.81)$$


Exercise 5.21
Prove (5.81). (Hint: express both sides of (5.80) in terms of $dx \wedge dy
\wedge dz$.)


In the notation introduced in chapter 4 we have

$$\alpha = \frac{1}{\rho}\, \epsilon^{ijk} S_{,i} V_{k,j}. \qquad (5.82)$$

Therefore $\alpha$ is the dual of $dS \wedge dV$ with respect to $\rho\omega$. The conservation of $\alpha$ is
then a natural consequence of the conservation of $dS \wedge dV$: the fact that $\rho\omega$ is
conserved means that forming duals with respect to it is an operation which is
also conserved, i.e. which commutes with the operator $\partial/\partial t + \pounds_V$.


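Exercise 5.22
The shear of a velocity field V is defined in Cartesian coordinates by
the equation

$$\sigma_{ij} = V_{i,j} + V_{j,i} - \tfrac{2}{3}\,\delta_{ij}\,\theta, \qquad (5.83)$$

where $\theta$ is the expansion

$$\theta = \nabla \cdot V. \qquad (5.84)$$

Show that in an arbitrary coordinate system

$$\theta = \tfrac{1}{2}\, g^{ij}\, \pounds_V g_{ij}, \qquad (5.85)$$
$$\sigma_{ij} = \pounds_V g_{ij} - \tfrac{2}{3}\,\theta\, g_{ij}. \qquad (5.86)$$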


E Cosmology 


5.19 The cosmological principle 

Most physicists are aware that Einstein’s theory of general relativity has 
given modern physics a consistent and fruitful framework in which to study cos- 
mology, the large-scale structure of our universe. Most are also aware that, at 
least at the simplest level, there are only three basic cosmological models: the
‘closed’, ‘flat’, and ‘open’ universes. What is probably less well known is that this
simplicity of having only three models is not at all a prediction or consequence 
of Einstein’s equations. Rather, it is simply a consequence of assuming that the 
universe is homogeneous and isotropic in its large-scale properties. (Homogeneity 
and isotropy will be defined precisely below.) General relativity, like all the 
fundamental theories of physics, is a dynamical theory: given initial conditions, 
it will predict their future evolution and past history. The uniformity of the uni- 
verse is part of the initial conditions we put in to construct the simplest models. 
The important contribution of general relativity is that it permits us to choose 
the geometry of space — its metric tensor field — as a part of the initial con- 
ditions. This is not possible in Newtonian gravity, of course. Once we decide to 
choose the most uniform initial conditions, it is differential geometry that tells 
us that only three metric tensor fields are possible. Our aim in the next few 
sections is to find these metrics. We shall use the mathematics of symmetry and 
invariance developed in chapter 3, but we will not need to know anything about 
general relativity nor even about Riemannian geometry. 

We begin with the physical problem: the universe. On a small scale the
universe is certainly lumpy. On nearly any length scale from the nuclear ($10^{-15}$ m)
to the interstellar ($10^{17}$ m), our world is characterized by clumping of matter
into small regions with sharp demarcations between different kinds of matter or 
between matter and the vacuum. The stars themselves group into more or less 
isolated galaxies, galaxies congregate into clusters of several tens to thousands, 
and even clusters may associate in loose superclusters. But modern astronomy 
can see well beyond the supercluster length scale, and we find that in all direc- 
tions the tendency is for greater and greater homogeneity in the properties of 
the universe when they are averaged over larger and larger length scales. Since it
is these large-scale averaged properties (particularly the mean density and
velocity) that are important for the dynamics of the universe, the cosmologist
would like to incorporate this homogeneity into at least the simplest models.
But what does homogeneity really mean? After all, in a dynamical universe, the
more distant regions should look different from those nearby if only because
they are seen at an earlier time in their history, as illustrated in figure 5.4.
Indeed this is the case: the number of quasars, for instance, is much higher in
distant regions than locally. The homogeneity one ‘observes’ is really an
extrapolation to the present time of the condition of distant regions. Yet in relativity
even ‘the present time’ is not an absolute concept. We cannot give a full
discussion of these problems here, but we can say how they are resolved.

Fig. 5.4. A slice of spacetime showing all the events labelled by
coordinates t (time) and x, with y = z = 0. Because electromagnetic radiation
travels at a finite speed, distant objects are seen at an earlier time in
their own histories than nearby objects.

The basic idea is to split spacetime up into a family of three-dimensional 
spacelike submanifolds filling it up (a foliation). These are called hypersurfaces 
of constant time (see figure 5.5). This really amounts just to a choice of
time-coordinate. The metric tensor $\mathbf{g}$ of spacetime has, like any $\binom{0}{2}$ tensor, a natural
restriction to each hypersurface, and the hypersurface is space-like if $\mathbf{g}$ is
positive-definite on all vectors tangent to it. The ‘uniformity’ of the cosmology
depends on the Killing vectors or isometries of these hypersurfaces. 

Let G be the Lie group of isometries of some manifold S with metric tensor
field $\mathbf{g}$. The Lie algebra of G is that of the Killing vector fields of $\mathbf{g}$. Elements
of G are mappings of S onto itself (diffeomorphisms). The action of G on S is 
said to be transitive on S if, for any two points P and Q of S, there is some 
element g of G for which g(P) = Q, i.e. which maps P to Q. The manifold S is 
said to be homogeneous if its isometry group acts transitively on it (see figure 
5.6). What this means is just that the geometry is the same everywhere in S. 
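
To make transitivity concrete, here is a minimal sketch in Python for the unit sphere $S^2$ pictured as embedded in $E^3$, whose isometries include the rotations. Given two points P and Q it builds a rotation carrying P to Q by Rodrigues' rotation formula. The function names rotate and rotation_taking are illustrative choices of ours, and the embedding picture is an assumption of the sketch, not part of the intrinsic definition.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about the unit vector `axis` by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i]*c + k_cross_v[i]*s + axis[i]*k_dot_v*(1 - c) for i in range(3))

def rotation_taking(p, q):
    """Return (axis, angle) of a rotation mapping unit vector p to unit vector q."""
    n = cross(p, q)
    norm = math.sqrt(dot(n, n))
    angle = math.atan2(norm, dot(p, q))
    if norm < 1e-12:
        # p and q coincide or are antipodal: any axis orthogonal to p will do.
        trial = (1.0, 0.0, 0.0) if abs(p[0]) < 0.9 else (0.0, 1.0, 0.0)
        n = cross(p, trial)
        norm = math.sqrt(dot(n, n))
    axis = tuple(x / norm for x in n)
    return axis, angle

P = (0.0, 0.0, 1.0)   # north pole
Q = (1.0, 0.0, 0.0)   # a point on the equator
axis, angle = rotation_taking(P, Q)
print(rotate(P, axis, angle))   # numerically equal to Q
```

Because such a rotation can be written down for every pair of points, the rotations act transitively on the sphere in this picture.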

Fig. 5.5. Slicing spacetime into spaces of constant time t. 

Suppose there are elements of G which leave some point P of S fixed. Then 
the product of any two also leaves P fixed, and since the identity e is one of 
them, they form a subgroup $H_P$ of G called the isotropy group of P. These are, 
of course, the familiar rotations about an axis through P. The isotropy group of 
P keeps P fixed and therefore maps any curve through P to another curve 
through P (see figure 5.7). It consequently induces a map of tangent vectors at P 
to others at P: a map $T_P \to T_P$. This group of mappings is the linear isotropy 
group of P. (Recall the similar discussion of the adjoint representation of a Lie 
group, §3.17.) A manifold S of dimension m is said to be isotropic about P if 
its isotropy group $H_P$ is just SO(m), the group of rotations about arbitrary axes 
through P. If S is isotropic about every point P it is said to be isotropic. 

A cosmological model M is said to be a homogeneous cosmology if it has a 
foliation of space-like hypersurfaces, each of which is homogeneous; and 
similarly for an isotropic cosmology. As discussed above, the evidence is strong 
that our universe is homogeneous, at least on large scales in our observable neigh- 
borhood. We also see no systematic variations in its structure in different direc- 
tions in the sky. This suggests the universe is isotropic about us. But modern 
science does not like to assume that we live in a particularly favorable location in 
the universe. This is often elevated to the status of a principle, variously known 
as the cosmological principle, the Copernican principle, or the principle of 
mediocrity: the properties of the universe we see near us would be seen, on average, 
by any observer anywhere else in the universe. This principle enables 
cosmologists, in the absence of information to the contrary, to extend our local 
homogeneity and isotropy to the whole universe. This is not necessary, of course, 
and much current research is devoted to exploring inhomogeneous and/or 
anisotropic cosmologies. But the three basic models are the only three which have 
homogeneous, isotropic three-spaces. This is what we shall now prove. 

Fig. 5.6. Some neighborhood U of P is mapped by g onto a neighborhood 
V of Q = g(P) isometrically: there is no difference in the geometry 
near P from that near Q. 

Fig. 5.7. The isotropy group of P maps $T_P \to T_P$ by mapping curves 
through P to other curves. 


Exercise 5.23 

As we know from §3.9, the Killing vectors of the sphere $S^2$ are the 
vectors $l_x$, $l_y$, $l_z$. These form a basis for the Lie algebra of the group of 
isometries of $S^2$, SO(3). Prove that $S^2$ is a homogeneous and isotropic 
manifold. 


5.20 Lie algebra of maximal symmetry 

We shall begin by studying the Killing vector fields of a three-dimensional 
manifold S. If ξ is a Killing vector, its components in any coordinate 
system satisfy the equations 

    $(\pounds_\xi \mathbf{g})_{ij} = \xi^k g_{ij,k} + \xi^k{}_{,i}\, g_{kj} + \xi^k{}_{,j}\, g_{ik} = 0.$    (5.87)

It will be more convenient to use the components of the one-form $\mathbf{g}(\xi,\ )$, 

    $\xi_k = g_{kl}\,\xi^l.$    (5.88)

These satisfy the equivalent equations 

    $\xi_{i,j} + \xi_{j,i} - 2\xi_k \Gamma^k{}_{ij} = 0,$    (5.89)

with the definition 

    $\Gamma^k{}_{ij} = \tfrac12 g^{km}(g_{mi,j} + g_{mj,i} - g_{ij,m}).$    (5.90)


(The definition of $\Gamma^k{}_{ij}$, including its factor of $\tfrac12$, is conventional and would make 
more sense after a reading of chapter 6. For us equation (5.90) simply defines a 
convenient shorthand notation.) 

Equation (5.89) is symmetric under exchange of i and j, so it represents in n 
dimensions $\tfrac12 n(n+1)$ independent differential equations, six for n = 3. Since 
there are only three components of ξ to solve for, the system is overdetermined: 
a general metric tensor $\mathbf{g}$ has no Killing vectors. Our object is to find what form 
$\mathbf{g}$ must take in order that it allow the maximum number of Killing vectors. To 
see what this maximum number is, we differentiate (5.89) to get 


    $\xi_{i,jk} + \xi_{j,ik} = 2(\xi_l \Gamma^l{}_{ij})_{,k}.$    (5.91)


By adding (5.91) to itself with the index permutation (i → k, j → i, k → j) and 
subtracting the permutation (i → j, j → k, k → i) we arrive at the equation 


    $\xi_{i,jk} = H_{ijk}{}^{l}\,\xi_l + K_{ijk}{}^{lm}\,\xi_{l,m},$    (5.92)
where $H_{ijk}{}^{l}$ is a complicated function of $g_{ij}$ and its first and second derivatives, 
and $K_{ijk}{}^{lm}$ similarly depends on $g_{ij}$ and its first derivatives. The key point about 
(5.92) is that if we know $\xi_i$ and $\xi_{i,j}$ at any point P and if we know $g_{ij}$ everywhere, 
then we can determine $\xi_{i,jk}$ at P from (5.92), and similarly all its higher 
derivatives at P by successively differentiating (5.92). On an analytic manifold 
(which we shall assume) this suffices to determine the vector field ξ everywhere. 
Moreover, we know that $\xi_i$ at P determines the symmetric part of $\xi_{i,j}$ at P by 
equation (5.89). It follows that every Killing vector field on S is determined 
completely by giving the values of 

    $\eta_i = \xi_i(P)$ and $A_{ij} = \xi_{[i,j]}(P)$    (5.93)

at any point P of S. It is important that a choice of $\{\eta_i, A_{ij}\}$ at P does not 
necessarily determine a Killing vector, because it may happen that (5.92) has no 
solutions: its right-hand side may not be symmetric under exchange of j and k. 
But the argument does show that there cannot be more Killing vectors than the 
number of independent choices of $\{\eta_i, A_{ij}\}$, which in m dimensions is 

    $m + \tfrac12 m(m-1) = \tfrac12 m(m+1),$    (5.94)

by virtue of (5.93). A manifold is said to be maximally symmetric if it has the 
maximum number of Killing vector fields. 
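
As a quick numerical illustration of the bound (5.94), the following short Python sketch tabulates it for low dimensions. The function name max_killing_vectors is ours, introduced only for this illustration.

```python
def max_killing_vectors(m):
    """Upper bound (5.94): m values of eta_i plus m(m-1)/2 independent
    components of the antisymmetric A_ij."""
    return m + m * (m - 1) // 2

for m in (1, 2, 3, 4):
    print(m, max_killing_vectors(m))
```

For m = 3 the bound is six, which is the number of Killing vectors sought for the three-spaces of the cosmological models.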

It is easy to show that a maximally symmetric connected manifold S is hom- 
ogeneous. At any point P we can choose a Killing vector field having any tangent 
at P. The one-parameter subgroups associated with these Killing vectors can 
therefore map P to any point Q in some neighborhood U of P (see figure 5.8). 
By a succession of such maps we can clearly map P to any point in S whatever. It 
follows that the isometry group maps P to any point, and S is homogeneous. 

Fig. 5.8. By choosing the appropriate one-parameter subgroup of the 
isometry group one can map P to any point Q or Q' in a neighborhood U. 

Next we take a look at the isotropy group of P. Such transformations leave P 
fixed, so the associated Killing vector fields vanish at P. The Lie bracket of any 
two Killing fields V and W is 

    $[V, W]^i = V^j W^i{}_{,j} - W^j V^i{}_{,j}$ 
    $\Rightarrow\ [V, W]_i = V^j W_{i,j} - W^j V_{i,j} - g_{ik,j}(V^j W^k - W^j V^k).$    (5.95)

If V and W both vanish at P, then so does [V, W]. But [V, W] is a linear combi- 
nation of Killing vector fields, so for it to vanish at P it must be a linear combi- 
nation only of those fields which also vanish at P. So these fields form a Lie sub- 
algebra, clearly the algebra of the isotropy group at P. The next exercise shows 
that the isotropy group is SO(m) if S is space-like, i.e. that a maximally sym- 
metric space-like manifold is isotropic. 


Exercise 5.24 
Choose at P the sort of coordinate system permitted by exercise 2.14, 
in which for a space-like manifold $g_{ij}(P) = \delta_{ij}$ and $g_{ij,k}(P) = 0$. 

(a) Show that near P an isotropy Killing vector field is given by 

    $V^i = A^i{}_j x^j + O(x^2),$    (5.96)

where $A^i{}_j$ is an arbitrary antisymmetric matrix 

    $A^i{}_j = -A^j{}_i.$    (5.97)

(b) Let W be another isotropy Killing vector field, 

    $W^i = B^i{}_j x^j + O(x^2),$ 

and show that 

    $[V, W]^i = [A, B]^i{}_j x^j + O(x^2),$    (5.98)

where $[A, B]^i{}_j$ denotes the elements of the matrix commutator of $A^i{}_j$ 
and $B^i{}_j$. This shows that the Lie algebra of the isotropy group is the 
same as the Lie algebra of SO(m). 

(c) Argue from this that the isotropy group of P is SO(m). 

(d) Show that if $\mathbf{g}$ is not positive-definite (or negative-definite) then the 
isotropy group is not SO(m). In particular show that the isotropy group 
of a point P in four-dimensional Minkowski space is the Lorentz group 
L(4). 
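
The closure property that makes antisymmetric matrices a Lie algebra can be checked by hand on a small example. The sketch below, in plain Python, forms the commutator of two antisymmetric 3 × 3 matrices and confirms that the result is again antisymmetric. The matrices A and B and the helper names are illustrative choices of ours; this is a spot check, not a solution to the exercise.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def is_antisymmetric(a):
    n = len(a)
    return all(a[i][j] == -a[j][i] for i in range(n) for j in range(n))

A = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]   # generates rotations in the x-y plane
B = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]   # generates rotations in the y-z plane
C = commutator(A, B)
print(C, is_antisymmetric(C))
```

The commutator here is the generator of rotations in the x-z plane, so the three independent antisymmetric matrices close among themselves.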


5.21 The metric of a spherically symmetric three-space 

Now we restrict our attention to space-like three-manifolds. The iso- 
tropy group is SO(3) and we say the manifold is spherically symmetric about 
any point. In this section we construct a convenient coordinate system for the 
rest of our calculation. We know that the Killing vectors of SO(3) define spheres 
$S^2$ by their integral curves. Since every point is on one such sphere, they must 
foliate the manifold S. We will adopt spherical coordinates, with the usual θ and 
φ on each sphere and a third ‘radial’ coordinate labelling spheres. There is a 
particularly convenient choice for the radial coordinate. The metric of S induces 
a metric tensor on each sphere, which in turn defines a volume two-form and a 
total area (integral of the volume two-form). We define the radial coordinate r 
of a sphere by the equation 

    $\text{area} = 4\pi r^2, \qquad r = (\text{area}/4\pi)^{1/2}.$    (5.99)

This intrinsically defined coordinate need not be monotonically increasing everywhere, 
as figure 5.9 shows. But at least in some neighborhood of P it is guaranteed 
to be good by the local flatness theorem, exercise 2.14. (It is singular at 
r = 0, of course, but we know how to handle that.) 

In addition to the radial coordinate we have to define θ and φ more precisely. 
We have placed θ and φ on each sphere but we have not said how the pole θ = 0 
of one sphere is related to that of another. That is, we are free to slide the 
coordinates of a sphere around as we move from one to another. We fix the pole 
in the following manner. At every point Q there is a vector n orthogonal to the 
sphere at that point ($\mathbf{g}(n, V) = 0$ for any V in $T_Q(S^2)$), normalized to unity 
($\mathbf{g}(n, n) = 1$), and pointing away from P (which is well defined near P and 
extends to all of S by continuity). This vector field is called the unit normal 
vector field, and is $C^\infty$ except at P. Choose the pole of any particular $S^2$ 
arbitrarily and then fix the poles of all the others by demanding they lie on the 
integral curve of n through the original pole. This is illustrated in figure 5.10. 
This clearly will imply that any integral curve of n is a curve of constant θ and φ, 
or in other words a coordinate line of the radial coordinate. Since ∂/∂θ and ∂/∂φ 
are tangent to the spheres this construction implies 

    $g_{r\theta} = \mathbf{g}(\partial/\partial r, \partial/\partial\theta) = 0,$    (5.100a)
    $g_{r\phi} = \mathbf{g}(\partial/\partial r, \partial/\partial\phi) = 0.$    (5.100b)

Moreover, on each sphere the metric is that of the unit sphere times $r^2$, the 
appropriate factor to make the area be $4\pi r^2$: 

    $g_{\theta\theta} = r^2, \quad g_{\theta\phi} = 0, \quad g_{\phi\phi} = r^2\sin^2\theta.$    (5.100c)

We therefore have only one unknown metric component, $g_{rr}$. 

Fig. 5.9. A radial coordinate labelling circles on a sphere, defined as the 
circumference ÷ 2π. This is the two-dimensional analogue of the situation 
described in the text. The radial coordinate increases away from P 
at first (say from A to B) but begins decreasing (from C to D) and 
becomes zero at P'. 


Exercise 5.25 
(a) Define the radial distance from P to a sphere with coordinate r to be 
the integral 

    $\int_0^r (g_{rr})^{1/2}\, dr'$    (5.101)

along a line θ = const, φ = const. Argue that $g_{rr}$ must be independent 
of θ and φ. 
(b) Show from exercise 2.14 that as one approaches P, 

    $\lim_{r \to 0} g_{rr} = 1.$    (5.102)


By exercise 5.25(a) we write $g_{rr} = f(r)$ and have the metric 

    $(g_{ij}) = \begin{pmatrix} f(r) & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}.$    (5.103)

As we have used only the isotropy group of P to get this, we should not expect 
to be able to determine f(r). For that we must use the rest of the isometries of S. 


Fig. 5.10. Establishing the pole of each circle of constant r in figure 5.9 
by requiring them all to lie on a single integral curve of the unit normal 
field n. 


5.22 Construction of the six Killing vectors 
There are a number of methods we could use to find the form of f(r) 
that guarantees the homogeneity of S. The method we shall use is to construct 
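all the Killing vector fields of S by using the vector spherical harmonics of §4.29. 
Any vector field V on S can be written in the form 

    $V = \xi_{lm}(r)\, Y_{lm}\, \frac{\partial}{\partial r} + \eta_{lm}(r)\, \bar{Y}_{lm} + \zeta_{lm}(r)\, \hat{Y}_{lm},$    (5.104)

with an implied summation on l and m here and wherever they are repeated in 
the same term. We shall need the components of this equation. It is easy to 
deduce from equation (4.101) that 

    $(\bar{Y}_{lm})^\theta = Y_{lm,\theta}, \qquad (\bar{Y}_{lm})^\phi = \frac{1}{\sin^2\theta}\, Y_{lm,\phi},$    (5.105a)
    $(\hat{Y}_{lm})^\theta = \frac{1}{\sin\theta}\, Y_{lm,\phi}, \qquad (\hat{Y}_{lm})^\phi = -\frac{1}{\sin\theta}\, Y_{lm,\theta}.$    (5.105b)

It follows that 

    $V^r = \xi_{lm} Y_{lm},$    (5.106a)
    $V^\theta = \eta_{lm} Y_{lm,\theta} + \zeta_{lm} Y_{lm,\phi}/\sin\theta,$    (5.106b)
    $V^\phi = \eta_{lm} Y_{lm,\phi}/\sin^2\theta - \zeta_{lm} Y_{lm,\theta}/\sin\theta.$    (5.106c)

These components have to satisfy Killing’s equation 

    $K_{ij} \equiv V^k g_{ij,k} + V^k{}_{,i}\, g_{kj} + V^k{}_{,j}\, g_{ik} = 0,$    (5.107)

with $g_{ij}$ from (5.103). 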

The three equations {$K_{\theta\theta} = 0$, $K_{\theta\phi} = 0$, $K_{\phi\phi} = 0$} do not involve derivatives 
of $\xi_{lm}$, $\eta_{lm}$, or $\zeta_{lm}$, so we shall tackle them first. First consider the combination 
(indices raised with (5.103)) 

    $0 = K^\theta{}_\theta + K^\phi{}_\phi = \frac{4}{r}\,\xi_{lm} Y_{lm} + 2\eta_{lm} L^2(Y_{lm}),$ 

where $L^2$ is the operator defined by equation (3.33). Using (3.33) we get 

    $[(2/r)\,\xi_{lm} - l(l+1)\,\eta_{lm}]\, Y_{lm} = 0.$ 

By the linear independence of the spherical harmonics we have 

    $\frac{2}{r}\,\xi_{lm} - l(l+1)\,\eta_{lm} = 0.$    (5.108)

Next consider the combinations 

    $0 = \tfrac12 (K^\theta{}_\theta - K^\phi{}_\phi) = F_{lm}\eta_{lm} + G_{lm}\zeta_{lm},$    (5.109a)
    $0 = -\frac{1}{r^2 \sin\theta}\, K_{\theta\phi} = -G_{lm}\eta_{lm} + F_{lm}\zeta_{lm},$    (5.109b)

where $F_{lm}$ and $G_{lm}$ are abbreviations for the expressions 

    $F_{lm} = Y_{lm,\theta\theta} - \cot\theta\, Y_{lm,\theta} - Y_{lm,\phi\phi}/\sin^2\theta,$ 
    $G_{lm} = 2Y_{lm,\theta\phi}/\sin\theta - 2\cot\theta\, Y_{lm,\phi}/\sin\theta.$ 

Equations (5.109) have the solution $\eta_{lm} = \zeta_{lm} = 0$ unless the determinant of 
their coefficients vanishes. But this is $(F_{lm})^2 + (G_{lm})^2$, so it vanishes only if 
both $F_{lm}$ and $G_{lm}$ vanish. It is easy to work out that this happens for l = 0 and 
l = 1 (any m) but not for l ≥ 2. Moreover, it is obvious from (5.106) that l = 0 
does not have a contribution from η or ζ (the fixed-point theorem for $S^2$ again!) 
so that we can conclude 

    l = 1: $\eta_{lm}$, $\zeta_{lm}$ arbitrary; 
    l ≥ 2: $\eta_{lm} = \zeta_{lm} = 0.$    (5.110)

Then (5.108) gives us 

    l = 0: $\xi_{00} = 0,$ 
    l = 1: $\xi_{1m} = r\,\eta_{1m},$    (5.111)
    l ≥ 2: $\xi_{lm} = 0.$ 
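
The claim that the abbreviations F and G of (5.109) vanish for l = 0 and l = 1 but not for l = 2 can be spot-checked numerically. The following Python sketch estimates the derivatives by central differences at one sample point, using a few real, unnormalized spherical harmonics. The sample harmonics, the sample point, and the function names are assumptions made for this illustration, and a numerical check at one point is of course not a proof.

```python
import math

def F_and_G(Y, theta, phi, h=1e-4):
    """Central-difference estimates of F and G from (5.109) for Y(theta, phi)."""
    Y_t = (Y(theta + h, phi) - Y(theta - h, phi)) / (2 * h)
    Y_p = (Y(theta, phi + h) - Y(theta, phi - h)) / (2 * h)
    Y_tt = (Y(theta + h, phi) - 2 * Y(theta, phi) + Y(theta - h, phi)) / h**2
    Y_pp = (Y(theta, phi + h) - 2 * Y(theta, phi) + Y(theta, phi - h)) / h**2
    Y_tp = (Y(theta + h, phi + h) - Y(theta + h, phi - h)
            - Y(theta - h, phi + h) + Y(theta - h, phi - h)) / (4 * h**2)
    s = math.sin(theta)
    cot = math.cos(theta) / s
    F = Y_tt - cot * Y_t - Y_pp / s**2
    G = 2 * Y_tp / s - 2 * cot * Y_p / s
    return F, G

# Real, unnormalized spherical harmonics, grouped by l.
HARMONICS = {
    0: [lambda t, p: 1.0],
    1: [lambda t, p: math.cos(t),
        lambda t, p: math.sin(t) * math.cos(p),
        lambda t, p: math.sin(t) * math.sin(p)],
    2: [lambda t, p: 3 * math.cos(t)**2 - 1,
        lambda t, p: math.sin(t) * math.cos(t) * math.cos(p),
        lambda t, p: math.sin(t)**2 * math.cos(2 * p)],
}

def largest_F_or_G(l, theta=1.0, phi=0.7):
    """Largest |F| or |G| over the sample harmonics of degree l at one point."""
    return max(max(abs(x) for x in F_and_G(Y, theta, phi)) for Y in HARMONICS[l])

for l in (0, 1, 2):
    print(l, largest_F_or_G(l))
```

For l = 0 and l = 1 the printed values are at the level of the finite-difference error, while for l = 2 they are of order unity.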


Now we turn to the other three equations in (5.107). The first is a scalar with 
respect to rotations: 

    $0 = K_{rr} = (2f\,\xi_{lm,r} + f_{,r}\,\xi_{lm})\, Y_{lm},$ 

which implies 

    $f\,\xi_{lm,r} + \tfrac12 f_{,r}\,\xi_{lm} = 0.$    (5.112)

The remaining two equations, $K_{r\theta} = K_{r\phi} = 0$, transform as a vector under 
rotations. The divergence of this vector (with respect to the volume of $S^2$) is 

    $0 = (\sin\theta\, K_r{}^\theta)_{,\theta} + (\sin\theta\, K_r{}^\phi)_{,\phi} = \left[\eta_{lm,r} + \frac{1}{r^2}\, f\,\xi_{lm}\right] \sin\theta\, L^2(Y_{lm}),$ 

which again implies (for l > 0) 

    $\eta_{lm,r} + \frac{1}{r^2}\, f\,\xi_{lm} = 0.$    (5.113)

The remaining equation can be taken to be the divergence of the dual of the 
vector in $S^2$, 

    $0 = K_{r\theta,\phi} - K_{r\phi,\theta} = r^2\,\zeta_{lm,r}\, \sin\theta\, L^2(Y_{lm}),$ 

which of course implies 

    $\zeta_{lm,r} = 0.$    (5.114)

We may conclude that {$\zeta_{1m}$, m = −1, 0, 1} are three arbitrary constants, the 
only contribution from $\hat{Y}_{lm}$. The three equations (5.111–13) for the unknowns $\xi_{1m}$, 
$\eta_{1m}$, and f have the following solution in terms of the arbitrary constants K and 
$V_m$: 

    $f = (1 - Kr^2)^{-1},$    (5.115)
    $\xi_{1m} = V_m (1 - Kr^2)^{1/2},$    (5.116)
    $\eta_{1m} = \frac{1}{r}\, V_m (1 - Kr^2)^{1/2}.$    (5.117)
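
The solution (5.115-17) can be spot-checked numerically. The Python sketch below takes the radial equations in the form f ξ' + ½ f' ξ = 0 for (5.112) and η' + f ξ / r² = 0 for (5.113), with a prime denoting d/dr; that reading of the two equations is an assumption of the sketch. The sample constants K and V and the helper names are ours.

```python
import math

K, V = 0.3, 1.7   # arbitrary sample constants; requires K * r**2 < 1 below

def f(r):
    return 1.0 / (1.0 - K * r * r)          # (5.115)

def xi(r):
    return V * math.sqrt(1.0 - K * r * r)   # (5.116)

def eta(r):
    return xi(r) / r                        # (5.117), same as xi = r * eta in (5.111)

def ddr(func, r, h=1e-5):
    """Central-difference derivative with respect to r."""
    return (func(r + h) - func(r - h)) / (2 * h)

def residuals(r):
    """Left-hand sides of (5.112) and (5.113); both should vanish."""
    res_112 = f(r) * ddr(xi, r) + 0.5 * ddr(f, r) * xi(r)
    res_113 = ddr(eta, r) + f(r) * xi(r) / r**2
    return res_112, res_113

for r in (0.5, 1.0, 1.5):
    print(r, residuals(r))
```

Both residuals come out at the level of the finite-difference error for each sample radius.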


Exercise 5.26 
Verify equations (5.105), (5.108), (5.109), (5.112), (5.113), (5.114), 
and (5.115-17). 


Exercise 5.27 
Show that the Killing vectors with $V_m = 0$ are those corresponding to 
the isotropy group of the origin r = 0. 


Exercise 5.28 
Show that the apparent singularity in $\eta_{1m}$ as r → 0 is a coordinate effect: 
the vector field is well-behaved at the origin. 


Exercise 5.29 

Set K = 0 in (5.115-17) and show that S is just $E^3$, Euclidean space. 
Find the constants $V_m$ that define the Killing vectors {∂/∂x, ∂/∂y, ∂/∂z}, 
where the Cartesian coordinates are obtained from our polars in the 
usual way. 


5.23 Open, closed, and flat universes 
We now have a complete description of the geometry of the homogeneous 
and isotropic spaces of the cosmological model: they have the metric tensor 

    $(g_{ij}) = \begin{pmatrix} (1 - Kr^2)^{-1} & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}.$    (5.118)

It only remains to try to get a picture of this geometry. The following coordi- 
nate transformations are a help. 


Exercise 5.30 
Find a coordinate transformation from r to χ which produces the 
following metric components 

for K > 0: 

    $(g_{ij}) = \frac{1}{K} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \sin^2\chi & 0 \\ 0 & 0 & \sin^2\chi\,\sin^2\theta \end{pmatrix},$    (5.119a)

for K < 0: 

    $(g_{ij}) = \frac{1}{|K|} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \sinh^2\chi & 0 \\ 0 & 0 & \sinh^2\chi\,\sin^2\theta \end{pmatrix}.$    (5.119b)


This shows that the geometry really depends only on the sign of K. Its magni- 
tude serves only as an overall scale factor. 

In the case K > 0, the sphere of radial coordinate χ has area $4\pi\sin^2\chi/K$, which 
increases away from χ = 0 to a maximum at χ = π/2 and then decreases to zero 
at χ = π. This is reminiscent of $S^2$ (figure 5.9). In fact, this is the metric of the 
sphere $S^3$ of radius $K^{-1/2}$. Because the space is finite, the universe is said to be 
closed. 
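
The finiteness of the closed space can be illustrated by adding up the volumes of thin shells. In the metric (5.119a) a shell between χ and χ + dχ has area 4π sin²χ / K and proper thickness dχ / √K, so the total volume is the integral of their product from χ = 0 to χ = π. The Python sketch below evaluates this by the midpoint rule and compares it with 2π² K^(-3/2), the standard volume of a three-sphere of radius K^(-1/2). The function names and the choice of quadrature are ours.

```python
import math

def sphere_area(chi, K):
    """Area of the sphere of radial coordinate chi in the K > 0 metric (5.119a)."""
    return 4.0 * math.pi * math.sin(chi)**2 / K

def total_volume(K, n=2000):
    """Midpoint-rule integral of area times proper thickness d(chi)/sqrt(K)."""
    dchi = math.pi / n
    shells = sum(sphere_area((i + 0.5) * dchi, K) for i in range(n))
    return shells * dchi / math.sqrt(K)

for K in (1.0, 4.0):
    print(K, total_volume(K), 2 * math.pi**2 * K**-1.5)
```

The two columns agree, and the volume scales as K^(-3/2), which is the sense in which the magnitude of K sets only the overall scale.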


Exercise 5.31 

Find a coordinate transformation of $E^4$ from Cartesian coordinates $\{x^i\}$ 
= {w, x, y, z} to spherical coordinates $\{x^{i'}\}$ = {r, χ, θ, φ} in which the 
metric $g_{ij} = \delta_{ij}$ has the components $g_{i'j'}$ given by (5.119a) when 
restricted to the sphere $S^3$, $w^2 + x^2 + y^2 + z^2 = K^{-1}$. 


The case K = 0 has been considered in exercise 5.29. It is the flat universe. 

The case K < 0 is the open universe, and it is the hardest to visualize. The surface 
area of a sphere of radial coordinate χ is $4\pi\sinh^2\chi/|K|$, and increases ever 
more rapidly with χ. This universe is unbounded. 


Exercise 5.32 

(a) By considering the relation between the areas of spheres χ = const and 
the distance of the sphere from the origin χ = 0, equation (5.101), 
prove that the metric (5.119b) is not the restriction of the Euclidean 
metric to any submanifold of any $E^n$. 

(b) Find a submanifold of Minkowski space whose metric is that of 
(5.119b). 


When Einstein’s equations are supplied with initial data which are homo- 
geneous and isotropic (and this includes not only the geometry but the matter 
variables as well), then the subsequent evolution of the universe maintains the 
symmetry. It follows that the only aspect of the geometry which can change 
with time is the scale factor K: the universe gets ‘larger’ or ‘smaller’ as time goes 
on. One must be careful, however, not to make coordinate-dependent statements. 
For the closed universe, whose total volume is finite, the change in K 
does cause a change in the total volume. But the flat and open universes are both 
infinite, so it is not meaningful to talk about their total volume. What general 
relativity tells us is that the coordinates of equation (5.119) are ‘comoving’: the 
local mean rest frame of the galaxies in any small region of the universe stays at 
constant {χ, θ, φ} as time evolves. It follows then that a change in K produces a 
change in the distance between galaxies, and this is what is meant by an expanding 
universe. In the ‘standard model’ of the universe, which assumes homogeneity 
and isotropy and a few other things, all three kinds of universe begin with zero 
‘volume’ (K = ∞) and expand away from this ‘big bang’. The closed universe 
expands to a maximum and recollapses, the flat universe expands at a rate which 
goes asymptotically to zero, and the open universe expands at a rate which goes 
asymptotically to a nonzero limit. All of these things are consequences of 
Einstein’s equations. To understand these equations it is necessary to add one 
more level of structure to our manifolds: the affine connection. This is the 
subject of chapter 6. 
6 CONNECTIONS FOR RIEMANNIAN MANIFOLDS 
AND GAUGE THEORIES 


6.1 Introduction 
The subject of this chapter is outside the main theme of this book, 
which is the study of the differential structure of the manifold. The affine 
connection is an additional piece of structure which gives shape and curvature to a 
manifold; it does not arise naturally from the differential structure, nor is it even 
a tensor. For this reason, the chapter is marked as supplementary. Nevertheless, 
no treatment of differential geometry for physicists would be complete without 
this important and very topical subject. Connections are finding increasing popu- 
larity in physics, particularly in gauge theories in elementary particle physics. We 
shall mainly discuss affine connections (Riemannian manifolds), reserving an 
introductory section on gauge connections for the end. 

In earlier chapters we have occasionally added extra structure to a manifold, 
in that we have singled out a particular tensor field as special, either to serve as 
a volume-element or as a metric. Volume-elements are not far removed from the 
differential structure of the manifold. The metric, on the other hand, creates 
even more structure than the affine connection, as we shall see below. But we 
have been able to avoid all that in our applications, only using the metric in its 
role as a mapping between $\binom{N}{M}$ tensors and $\binom{N-1}{M+1}$ tensors. The affine connection 
cannot be fitted into the structures we have already developed. From the point 
of view of the differential structure, it is a radical new addition to the manifold, 
and it has correspondingly rich possibilities for physical application. 


6.2 Parallelism on curved surfaces 

We have repeatedly emphasized that on a differentiable manifold there 
is no intrinsic notion of parallelism between vectors defined at different points. 
The affine connection is a rule whereby some notion of parallelism can be 
defined. To anticipate what kind of a rule may be possible, let us consider the 
notion of parallelism on an ordinary curved two-surface, the sphere. In figure 
6.1, the vector V is the tangent to the great circle ABC at the north pole, point 
A. Suppose we carry, or transport, V along ABC to the south pole, C. In order to 
be definable in two-dimensional terms it must be kept tangent to the sphere, so 
if we do not rotate it as we carry it, it will simply remain tangent to the curve 
ABC. It winds up as V’ at C, pointing in what, to us three-dimensional beings, 
looks like the direction antiparallel to V. Should we assume that, at least with 
respect to the sphere’s geometry, V and V’ are parallel? Before jumping to a con- 
clusion, suppose we transport V from A to C on the path ADC shown in figure 
6.2, where ADC is another great circle intersecting ABC at right angles at both 
poles. Since V starts out perpendicular to ADC, the natural way to move it with- 
out twisting is to keep it perpendicular to ADC and tangent to the sphere. This 
produces the vector V'' at C, which, to us, is in fact parallel to V. But V'' and 
V', both vectors at C, are antiparallel! Which is parallel to V? Clearly, if we 
simply consider the intrinsic properties of the sphere, neither vector deserves to 
be called parallel to V. There is no global notion of parallelism. All one can do 
(and this is what we have done) is to define a notion of parallel transport, of 
moving the vector along a curve without changing its direction. The affine con- 
nection is a rule for parallel transport. 
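
The two transports described above can be reproduced in a few lines of Python. The sketch assumes that carrying a vector along a great circle without twisting is realized, in the embedding space $E^3$, by rigidly rotating the sphere about the axis of that great circle; the axes chosen for ABC and ADC, the coordinates, and the function names are our own illustrative choices.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about the unit vector `axis` by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i]*c + k_cross_v[i]*s + axis[i]*k_dot_v*(1 - c) for i in range(3))

north_pole = (0.0, 0.0, 1.0)   # point A
V = (1.0, 0.0, 0.0)            # tangent at A, pointing along the circle ABC

axis_ABC = (0.0, 1.0, 0.0)     # ABC taken to lie in the x-z plane
axis_ADC = (1.0, 0.0, 0.0)     # ADC taken to lie in the y-z plane

# Half a turn about either axis carries A to the south pole C.
V_prime = rotate(V, axis_ABC, math.pi)          # transported along ABC
V_double_prime = rotate(V, axis_ADC, math.pi)   # transported along ADC

print(rotate(north_pole, axis_ABC, math.pi), V_prime)
print(rotate(north_pole, axis_ADC, math.pi), V_double_prime)
```

Up to rounding, the first transported vector is the negative of V and the second equals V, so the two results at C are antiparallel to each other, as the text argues.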


Fig. 6.1. Parallel transport of a vector V along a great circle of the 
sphere. 







Fig. 6.2. An alternative path for parallel transport, with a different 
result. 
6.3 The covariant derivative 
We shall for the moment view the affine connection in an abstract sense; 
it will become more concrete when we introduce components in the next section. 
For now, suppose we have a curve $\mathscr{C}$ and a connection, a rule for parallel 
transport. Let the tangent to $\mathscr{C}$ be U = d/dλ. At the point P, pick an arbitrary vector 
V from $T_P$. Then the connection allows us to define a vector field V along the 
curve $\mathscr{C}$, which is obtained by parallel-transporting V (see figure 6.3). Since we 
can now say that V does not change along $\mathscr{C}$, we can define a derivative with 
respect to which V has zero rate of change. This is called the covariant derivative 
along U, $\nabla_U$, and we write 

    $\nabla_U V = 0 \iff$ V is parallel-transported along $\mathscr{C}$.    (6.1)

If W is a vector field defined everywhere on $\mathscr{C}$, we can define its covariant 
derivative along $\mathscr{C}$ in much the same way as we did for Lie derivatives (see figure 6.4). 
To define $\nabla_U W$ at P, it will be convenient to express all vectors as functions of λ. 
If P has parameter value $\lambda_0$, then we define the field $W^*_{\lambda_0+\epsilon}(\lambda)$ to be that 
parallel-transported field ($\nabla_U W^* = 0$) which equals W at $\lambda_0 + \epsilon$. The vector 
$W^*_{\lambda_0+\epsilon}(\lambda_0)$ is the vector $W(\lambda_0 + \epsilon)$ parallel-transported back to $\lambda_0$. Then 
the derivative may be evaluated entirely in the vector space $T_P$: 

    $(\nabla_U W)_{\lambda_0} = \lim_{\epsilon \to 0} \frac{W^*_{\lambda_0+\epsilon}(\lambda_0) - W(\lambda_0)}{\epsilon}.$    (6.2)


Fig. 6.3. The affine connection permits us to define V(Q) for any point 
Q on $\mathscr{C}$ by parallel transport from P. 

Fig. 6.4. The vector field W on $\mathscr{C}$ is not parallel transported. Comparison 
with one which is permits a definition of the covariant derivative of W. 

Although this procedure resembles the one we used for defining the Lie 
derivative, it is important to understand the significant difference: ‘dragging back’ a 
vector for the Lie derivative required the entire congruence, so that U and W had 
to be defined in a neighborhood of the curve $\mathscr{C}$; parallel-transport, by contrast, 
requires only the curve $\mathscr{C}$, the fields U and W on the curve, and of course the 
connection on the curve. 
It is clear from (6.2) that $\nabla_U$ is a differential operator: 

    $\nabla_U(fW) = f\,\nabla_U W + W\,\nabla_U f = f\,\nabla_U W + W\,\frac{df}{d\lambda},$    (6.3a)

where the last step is the obvious extension to scalars. The covariant derivative 
can also be extended to tensors of arbitrary type by the Leibniz rules 


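    $\nabla_U(A \otimes B) = (\nabla_U A) \otimes B + A \otimes (\nabla_U B),$    (6.3b)
    $\nabla_U\langle\tilde\omega, A\rangle = \langle\nabla_U\tilde\omega, A\rangle + \langle\tilde\omega, \nabla_U A\rangle.$    (6.3c)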
Equations (6.3) guarantee compatibility of the connection with the differential 
structure. 

Suppose that we were to change the parameter along our curve from λ to μ. 
Then the new tangent would be gU, where g = dλ/dμ. From (6.2) it is clear that 
the covariant derivative would also be multiplied by g, since ε would be replaced 
by δμ = ε dμ/dλ while $W^*_{\mu_0+\delta\mu}(\mu_0)$ is the same as $W^*_{\lambda_0+\epsilon}(\lambda_0)$. (This is, strictly 
speaking, part of the definition of what we mean by a connection: the notion of 
parallel-transport along a curve must be independent of the parameter on the 
curve.) Therefore we conclude that for any function g 

    $\nabla_{gU} W = g\,\nabla_U W.$    (6.4a)
We must also put another restriction on the affine connection, which is that at a 
point the covariant derivatives in different directions should have the additive 
property 

    $(\nabla_U W)_P + (\nabla_V W)_P = (\nabla_{U+V} W)_P.$    (6.4b)

This makes ∇ behave like the ordinary ∇ of Euclidean vector calculus. Together 
(6.4a, b) imply that for any vector fields U, V, W and functions f, g we have 

    $\nabla_{fU+gV} W = f\,\nabla_U W + g\,\nabla_V W.$    (6.4c)


Exercise 6.1 
Show that (6.4c) and the fact that $\nabla_U W$ is a vector imply that ∇W is a 
$\binom{1}{1}$ tensor field whose value on arguments U and $\tilde\omega$ is 

    $\nabla W(\tilde\omega; U) = \langle\tilde\omega, \nabla_U W\rangle.$    (6.5)
This tensor is called the gradient of W. 
The fact that ∇W is a tensor field means that we have been able to remove the 
curve entirely from the definition of the covariant derivative. The tensor ∇W is 
defined only by W and the connection. One might be tempted to go further and 
say that ∇ is itself a $\binom{1}{2}$ tensor field which is just the connection; but this would 
be wrong. While ∇ may symbolize the connection, it is not a tensor field, since 
∇(fW) ≠ f∇W (cf. (6.3a)). For this reason, the connection cannot be regarded 
as a tensor field. 


6.4 Components: covariant derivatives of the basis 

Since any tensor can be expressed as a linear combination of basis 
tensors, and these basis tensors are all derivable from the vector basis $\{e_i\}$, the 
connection can be completely described by giving the gradients of the basis 
vectors. So we define 

    $\nabla_{e_i} e_j = \Gamma^k{}_{ji}\, e_k.$    (6.6)

The functions $\Gamma^k{}_{ji}$ are called Christoffel symbols. For fixed (i, j), $\Gamma^k{}_{ji}$ is the kth 
component of the vector field $\nabla_{e_i} e_j$. Note carefully the order of the indices on 
Γ: the one associated with the derivative goes last. We shall often use the shorthand 

    $\nabla_{e_i} = \nabla_i.$    (6.7)

In an n-dimensional manifold, the $n^3$ functions $\Gamma^k{}_{ji}$ completely determine the 
affine connection, and this is often the most convenient way of describing the 
connection. Notice that $\Gamma^k{}_{ji}$ is not a component of a tensor; under a basis 
transformation the indices k and i transform like tensor indices (by (6.5)) but the 
index j does not (by (6.3a)). 


Exercise 6.2 
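Show that 

    $\Gamma^{k'}{}_{j'i'} = \Lambda^{k'}{}_k\, \Lambda^j{}_{j'}\, \Lambda^i{}_{i'}\, \Gamma^k{}_{ji} + \Lambda^{k'}{}_k\, \Lambda^i{}_{i'}\, \nabla_i \Lambda^k{}_{j'},$ 

where by $\nabla_i \Lambda^k{}_{j'}$ we mean $d\Lambda^k{}_{j'}/d\lambda_i$, in which $e_i = d/d\lambda_i$ and $\Lambda^k{}_{j'}$ is 
treated as a function on the integral curves of $e_i$. 

Exercise 6.3 

Show, by exercise 6.1, that 

    $\{\Gamma^k{}_{ji}\, e_k \otimes \tilde\omega^i\}$ 

is a collection of $\binom{1}{1}$ tensors. (Here $\{\tilde\omega^i\}$ is the one-form basis dual to 
$\{e_i\}$.) 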

Exercise 6.4 

On the unit sphere the usual spherical coordinates θ and φ define the 
basis {$e_\theta$ = ∂/∂θ, $e_\phi$ = ∂/∂φ}. Extend the reasoning used in §6.2 to 
deduce that 

    $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta, \qquad \Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta,$ 

and all other Γ's vanish. (N.B. this is a difficult problem. You should 
make maximum use of the symmetry of the sphere, and make intelligent 
guesses about how vectors behave under parallel transport.) 


Exercise 6.5 
From (6.6) and (6.3c) deduce that 

    $\nabla_i \tilde\omega^j = -\Gamma^j{}_{ki}\, \tilde\omega^k.$    (6.8)


Now that we have the derivatives of basis vectors, we can find the derivatives 
of arbitrary tensors. For example, if U = d/dλ then 

    $\nabla_U V = U^i\, \nabla_i (V^j e_j)$ 
    $\qquad = U^i (\nabla_i V^j)\, e_j + U^i V^j\, \nabla_i e_j.$ 

In the first term, $V^j$ is simply a function, so $U^i \nabla_i(V^j) = dV^j/d\lambda$. Therefore we 
have 

    $\nabla_U V = \frac{dV^j}{d\lambda}\, e_j + U^i V^j\, \Gamma^k{}_{ji}\, e_k$ 
    $\qquad = \left(\frac{dV^j}{d\lambda} + \Gamma^j{}_{ki} V^k U^i\right) e_j.$ 

To get the final expression we had to redefine some summation indices in the 
final term. Since ∇V is itself a tensor, it has components 

    $(\nabla V)^j{}_i = \nabla_i(V^j) + \Gamma^j{}_{ki} V^k.$ 

A word about the term $\nabla_i(V^j)$. If $e_i$ is the vector d/dμ, then $\nabla_i(V^j) = dV^j/d\mu$, 
with $V^j$ simply a function along the curve whose parameter is μ. If $e_i$ is a 
coordinate basis vector then $e_i = \partial/\partial x^i$ and we have 

    $\nabla_i V^j = \frac{\partial V^j}{\partial x^i} = V^j{}_{,i},$ 

using the comma notation introduced for differential forms. It is customary even 
where $e_i$ is not a coordinate basis vector to use the comma notation: 

    $\nabla_{e_i} f = e_i[f] = f_{,i}$    (6.9)

on any function f. When $e_i$ is a coordinate basis vector, this is the usual partial 
derivative; when $e_i$ is not, then this is simply the derivative of f along $e_i$. We can 
thus write 

    $(\nabla V)^j{}_i = V^j{}_{;i} = V^j{}_{,i} + \Gamma^j{}_{ki} V^k.$    (6.10)
We have here introduced the semicolon notation for the covariant derivative. 
Whereas neither $V^j{}_{,i}$ nor $\Gamma^j{}_{ki} V^k$ transforms like a tensor, their sum clearly 
does. 
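
As a concrete instance of the component formula, the following Python sketch evaluates V^j_{;i} = V^j_{,i} + Γ^j_{ki} V^k on the unit sphere, taking the Christoffel symbols quoted in exercise 6.4 as given. The dictionary layout, the key order (k, j, i) for Γ^k_{ji}, and the function names are illustrative choices of ours.

```python
import math

COORDS = ("theta", "phi")

def christoffel(theta):
    """Nonzero symbols of the unit sphere from exercise 6.4.
    Key (k, j, i) stands for Gamma^k_{ji}, derivative index last."""
    cot = math.cos(theta) / math.sin(theta)
    return {
        ("theta", "phi", "phi"): -math.sin(theta) * math.cos(theta),
        ("phi", "theta", "phi"): cot,
        ("phi", "phi", "theta"): cot,
    }

def covariant_derivative(V, dV, theta):
    """Return V^j_{;i} = V^j_{,i} + Gamma^j_{ki} V^k as a nested dict [j][i].
    V[j] are the components at the point, dV[j][i] the partials V^j_{,i}."""
    G = christoffel(theta)
    return {j: {i: dV[j][i] + sum(G.get((j, k, i), 0.0) * V[k] for k in COORDS)
                for i in COORDS}
            for j in COORDS}

# The basis vector e_phi has constant components, so all its partials vanish.
V = {"theta": 0.0, "phi": 1.0}
dV = {j: {i: 0.0 for i in COORDS} for j in COORDS}

for theta in (math.pi / 4, math.pi / 2):
    print(theta, covariant_derivative(V, dV, theta))
```

At θ = π/2 the component V^θ_{;φ} vanishes, so the derivative of e_φ along itself is zero on the equator, whereas it is nonzero at other latitudes even though the components of e_φ are constant.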


Exercise 6.6 
Show that if $\tilde\omega$ is a one-form 

    $(\nabla\tilde\omega)_{ij} \equiv \omega_{i;j} = \omega_{i,j} - \Gamma^k{}_{ij}\, \omega_k.$    (6.11)

Exercise 6.7 
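Show that if T is a $\binom{N}{M}$ tensor, 

    $T^{i \ldots j}{}_{k \ldots l;m} = T^{i \ldots j}{}_{k \ldots l,m} + \Gamma^i{}_{nm} T^{n \ldots j}{}_{k \ldots l} + \cdots + \Gamma^j{}_{nm} T^{i \ldots n}{}_{k \ldots l}$ 
    $\qquad - \Gamma^n{}_{km} T^{i \ldots j}{}_{n \ldots l} - \cdots - \Gamma^n{}_{lm} T^{i \ldots j}{}_{k \ldots n}.$    (6.12)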


6.5 Torsion 

The two quantities [U, V] and $\nabla_U V - \nabla_V U$ are both vector fields and 
are both antisymmetric in U and V. A connection is said to be symmetric if they 
are equal: 

    $\nabla_U V - \nabla_V U = [U, V] \iff$ symmetric connection.    (6.13)

The name ‘symmetric’ is used because of the property proved in the following 
exercise. 


Exercise 6.8 
Show that in a coordinate basis, (6.13) implies that a connection is 
symmetric if and only if 


    $\Gamma^k{}_{ij} = \Gamma^k{}_{ji}.$    (6.14)


For a nonsymmetric connection we define the torsion $T^k{}_{ji}$: 

    $\nabla_{e_i} e_j - \nabla_{e_j} e_i - [e_i, e_j] = T^k{}_{ji}\, e_k.$    (6.15)


Exercise 6.9 
Show that $\{T^k{}_{ji}\}$ are the components of a $\binom{1}{2}$ tensor, which we call the 
torsion tensor T: 


    $\nabla_U V - \nabla_V U - [U, V] = T(\ ; U, V).$ 


The empty slot in T is for a one-form argument. 
Exercise 6.10 

Suppose a manifold has two connections defined on it, with Christoffel 
symbols $\Gamma^k{}_{ij}$ and $\Gamma'^k{}_{ij}$. Show that 

    $D^k{}_{ij} = \Gamma^k{}_{ij} - \Gamma'^k{}_{ij}$ 

are the components of a $\binom{1}{2}$ tensor. Show that the tensor D is symmetric 
in its vector arguments if and only if both connections have the 
same torsion tensor. 


Exercises 6.9 and 6.10 show that we can always define the symmetric part $\nabla^{(S)}$
of any connection $\nabla$ by defining the Christoffel symbols

$$\Gamma^{(S)k}{}_{ij} = \Gamma^k{}_{ij} - \tfrac12 T^k{}_{ij}.$$

While torsion is in principle a useful part of the connection, it has not had as 
much popularity as the symmetric part in constructing mathematical models for 
physical laws. From now on we will deal with symmetric connections unless 
otherwise specified. One reason for this will be apparent in exercise 6.18 below. 
Notice that the definition (6.13) immediately guarantees the following. 


Exercise 6.11
A manifold has a symmetric connection. Show that in any expression
for the components of the Lie derivative of a tensor, all commas can be
replaced by semicolons. An example:

$$(\pounds_{\bar U}\tilde\omega)_i = \omega_{i,j}U^j + \omega_j U^j{}_{,i} = \omega_{i;j}U^j + \omega_j U^j{}_{;i}.$$

(Naturally, all commas must be changed, not just some.)


6.6 Geodesics 


A geodesic curve is a curve that parallel-transports its own tangent
vector. The geodesic equation is

$$\nabla_{\bar U}\bar U = 0. \qquad (6.16a)$$

If $\lambda$ is the parameter of the curve and $\{x^i\}$ is any coordinate system, this becomes

$$\frac{dU^i}{d\lambda} + \Gamma^i{}_{jk}U^jU^k = 0, \qquad (6.16b)$$

or

$$\frac{d^2x^i}{d\lambda^2} + \Gamma^i{}_{jk}\frac{dx^j}{d\lambda}\frac{dx^k}{d\lambda} = 0. \qquad (6.16c)$$

The last equation is a quasi-linear system of differential equations for $x^i(\lambda)$, the
equation of the curve.
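To make (6.16c) concrete, here is a minimal sketch that integrates the geodesic equation numerically on the flat plane in polar coordinates and confirms that the resulting curve is a straight line. It assumes the polar Christoffel symbols quoted in exercise 6.16; the Runge-Kutta integrator, the step count, and all function names are illustrative choices rather than anything prescribed by the text.

```python
import math

def rhs(state):
    """First-order form of (6.16c) in polar coordinates: state = (r, theta, U^r, U^theta)."""
    r, th, ur, uth = state
    return [ur,
            uth,
            r * uth * uth,            # -Gamma^r_{theta theta} U^theta U^theta, with Gamma = -r
            -2.0 * ur * uth / r]      # -2 Gamma^theta_{r theta} U^r U^theta, with Gamma = 1/r

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return [si + c * ki for si, ki in zip(s, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def geodesic(state, lam_max, steps=1000):
    """Integrate from affine parameter 0 to lam_max and return the final state."""
    h = lam_max / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at (r, theta) = (1, 0), moving in the +y direction: U^r = 0, U^theta = 1.
r, th, ur, uth = geodesic([1.0, 0.0, 0.0, 1.0], lam_max=2.0)
x, y = r * math.cos(th), r * math.sin(th)
```

Converted back to Cartesian coordinates, the endpoint lies on the line x = 1 with y equal to the elapsed affine parameter, as expected for a straight line traversed at unit speed.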


Exercise 6.12 
Recall that our definition of a curve includes its parameter. If λ is a 
parameter for which (6.16c) is true, show that a change of parameter to 


$$\mu = a\lambda + b, \qquad (6.17)$$


where a and b are constants, also gives a solution to (6.16c). The param- 
eter of a geodesic curve is called an affine parameter. 


Notice that only the symmetric part of a connection contributes to the geodesic
equation. This provides a way of displaying the geometrical effect of torsion.
Take a geodesic through a point P with tangent vector $\bar U$. In $T_P$ choose a linear
subspace $R_P$ of dimension $n - 1$ (the manifold’s dimension being $n$) which is
linearly independent of $\bar U$. Pick a vector $\bar\xi$ in $R_P$ and construct a geodesic
through P tangent to $\bar\xi$. Using the symmetric part of the connection, parallel-
transport $\bar U$ along $\bar\xi$ a small affine parameter distance $\epsilon$. Construct a new geodesic
through this new point tangent to $\bar U$ there (see figure 6.5). This geodesic will be
roughly parallel to the first one. In this manner, any point in the neighborhood
of P can be given a geodesic ‘parallel’ to $\bar U$. Along this congruence of geodesics
we can transport the original ‘linking’ vector $\bar\xi$ in two ways, either by parallel-
transport or by Lie dragging. Let $\bar\xi$ be parallel-transported. Then by (6.15) we
have

$$(\pounds_{\bar U}\bar\xi)^i = -(\nabla_{\bar\xi}\bar U)^i - T^i{}_{jk}\,\xi^jU^k.$$

The initial vector $\bar\xi$, however, had the property that $\nabla^{(S)}_{\bar\xi}\bar U = 0$, so that
$(\nabla_{\bar\xi}\bar U)^i = -\tfrac12 T^i{}_{jk}\,\xi^jU^k$ initially. Therefore we have the initial value

$$(\pounds_{\bar U}\bar\xi)^i = -\tfrac12 T^i{}_{jk}\,\xi^jU^k.$$


Fig. 6.5. Two parallel geodesics $\bar U$ and $\bar U'$ and the vector $\bar\xi$ which con-
nects them in the plane $R_P$ and is parallel-transported along $\bar U$. If there
is torsion it rotates away from $\bar U'$.
What this means is that a vector $\bar\xi$ parallel-transported by a symmetric connection
would stay ‘attached’ to the parallel congruence of geodesics we have con-
structed. But if the connection is not symmetric, the vector does not remain
fixed in this congruence. Speaking loosely, it is ‘rotated’ relative to nearby geo-
desics by the action of torsion. Conversely, if we regard the parallel-transported
vector $\bar\xi$ as defining a ‘fixed’ direction as it moves along, then the congruence of
‘parallel’ geodesics twists around the one carrying $\bar\xi$. (One cannot, however,
define precisely the notions of ‘rotation’ and ‘twist’ without a metric.)


6.7 Normal coordinates 

It will be helpful below to use a coordinate system based on geodesics.
To construct this, we note that the geodesic curves through a point P give a 1-1
mapping of a neighborhood of P onto a neighborhood of the origin of $T_P$. This
map arises because each element of $T_P$ defines a unique geodesic curve through
P, so we can associate the vector in $T_P$ with the point an affine parameter dis-
tance $\Delta\lambda = 1$ along the curve from P. (Recall that if two elements of $T_P$ are
parallel, their geodesic curves have the same path but different parameters, and
so the map picks out different points along the path.) Using this map and choos-
ing an arbitrary basis for $T_P$, one defines the normal coordinates of a point Q to
be the components of the vector in $T_P$ it is associated with. This map will
generally be 1-1 only in some neighborhood of P, since geodesics may cross on a
curved manifold. For some connections, such as that of flat space, the map is
1-1 over the entire manifold. (The map from $T_P$ to the manifold is well-defined
even if geodesics cross. It is called the exponential map. If it is defined for all
elements of $T_P$ at all points P then the manifold is said to be geodesically com-
plete.) For our purposes the principal interest in the normal coordinates is that
$\Gamma^i{}_{jk} = 0$ at P (but not elsewhere in the neighborhood of P). To see this, note
that if a vector $\bar U$ with components $U^i(P)$ defines a geodesic curve, then the
coordinates of the point with affine parameter $\lambda$ along that curve are simply
$x^i = \lambda U^i(P)$, with the convention that $\lambda = 0$ at P. Therefore $d^2x^i/d\lambda^2$ vanishes,
and (6.16c) tells us that $\Gamma^i{}_{jk}U^j(P)U^k(P)$ must vanish along the whole curve. At
P, however, $U^i$ had an arbitrary direction, which means that $\Gamma^i{}_{jk}(P) = 0$.

The fact that it is always possible to choose a coordinate system to make $\Gamma^i{}_{jk}$
vanish at a point will be of great help in proving several theorems below. Since it
is not necessary for $\Gamma^i{}_{jk}$ to vanish anywhere else, the derivatives of $\Gamma^i{}_{jk}$ at P do
not vanish.


6.8 Riemann tensor 
One might expect that the commutator of two covariant derivatives,

$$[\nabla_{\bar U}, \nabla_{\bar V}] = \nabla_{\bar U}\nabla_{\bar V} - \nabla_{\bar V}\nabla_{\bar U},$$

should be a differential operator. In fact, however, it has the following remark-
able property: the operator $\mathbf R$, defined by

$$[\nabla_{\bar U}, \nabla_{\bar V}] - \nabla_{[\bar U, \bar V]} \equiv \mathbf R(\bar U, \bar V), \qquad (6.18)$$

is a multiplicative operator. Even more remarkably, $\mathbf R$ does not depend on deriv-
atives of $\bar U$ and $\bar V$ either. These properties are explained and proved in the
following exercise.


Exercise 6.13
Prove, for an arbitrary function $f$, that

(a) $\mathbf R(\bar U, \bar V) f\bar W = f\,\mathbf R(\bar U, \bar V)\bar W$,
(b) $\mathbf R(f\bar U, \bar V)\bar W = f\,\mathbf R(\bar U, \bar V)\bar W$.


Because of these properties, (6.18) actually defines a tensor, which is called the
Riemann tensor. Given vectors $\bar U$, $\bar V$, (6.18) shows that $\mathbf R(\bar U, \bar V)$ is a $\binom{1}{1}$ tensor,
since the left-hand side operates on a vector to give a new vector. With $\bar U$ and $\bar V$
also regarded as variable arguments, the Riemann tensor becomes a $\binom{1}{3}$ tensor.
(N.b. the conventions used for defining the Riemann tensor, (6.18) and (6.19), 
are by no means universal. Other definitions may differ in sign and the ordering 
of arguments. When consulting other books, make sure you find what conven- 
tion is being used. We follow Misner, Thorne & Wheeler (1973).) 


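Exercise 6.14
The components of the Riemann tensor, $R^l{}_{kij}$, are defined by

$$[\nabla_i, \nabla_j]\,\bar e_k - \nabla_{[\bar e_i, \bar e_j]}\,\bar e_k = R^l{}_{kij}\,\bar e_l. \qquad (6.19)$$

(a) Show that in a coordinate basis

$$R^l{}_{kij} = \Gamma^l{}_{kj,i} - \Gamma^l{}_{ki,j} + \Gamma^m{}_{kj}\Gamma^l{}_{mi} - \Gamma^m{}_{ki}\Gamma^l{}_{mj}. \qquad (6.20)$$

(b) In a noncoordinate basis define the commutation coefficients $C^l{}_{ij}$ by

$$[\bar e_i, \bar e_j] = C^l{}_{ij}\,\bar e_l. \qquad (6.21)$$

Show that

$$R^l{}_{kij} = \Gamma^l{}_{kj,i} - \Gamma^l{}_{ki,j} + \Gamma^m{}_{kj}\Gamma^l{}_{mi} - \Gamma^m{}_{ki}\Gamma^l{}_{mj} - C^m{}_{ij}\Gamma^l{}_{km}, \qquad (6.22)$$

where $f_{,i} = \bar e_i[f]$.

(c) Show that

$$R^l{}_{k(ij)} \equiv \tfrac12\left(R^l{}_{kij} + R^l{}_{kji}\right) = 0, \qquad (6.23a)$$

and

$$R^l{}_{[kij]} = 0. \qquad (6.23b)$$

(Hint: for (6.23b), use normal coordinates. The result, of course, is
independent of the basis.)

(d) Using (c) show that in an n-dimensional manifold, the number of
linearly independent components of $R^l{}_{kij}$ is

$$n^2\,\frac{n(n-1)}{2} - n\,\frac{n(n-1)(n-2)}{3!} = \tfrac13 n^2(n^2 - 1). \qquad (6.24)$$
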
Exercise 6.15
Show that

$$R^l{}_{k[ij;m]} = 0. \qquad (6.25)$$

These are called the Bianchi identities. (Hint: again work in normal
coordinates.) Show that this result is equivalent in a coordinate basis to
the Jacobi identity for covariant derivatives

$$[\nabla_i, [\nabla_j, \nabla_k]] + [\nabla_j, [\nabla_k, \nabla_i]] + [\nabla_k, [\nabla_i, \nabla_j]] = 0.$$

Cf. equations (2.14) and (3.9).
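Formula (6.20) is easy to evaluate numerically. The sketch below does so for the unit two-sphere in coordinates $(\theta, \phi)$, using central differences for the derivatives of the Christoffel symbols, and then checks the symmetries (6.23a) and (6.23b). The sphere's Christoffel symbols are the standard ones for its metric connection and are brought in here as an assumption (they are not derived in this section); the function names and step size are illustrative.

```python
import math

N = 2  # dimension; index 0 = theta, index 1 = phi

def gamma(x):
    """G[l][k][j] = Gamma^l_{kj} on the unit sphere at x = (theta, phi) (assumed values)."""
    th = x[0]
    G = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    G[0][1][1] = -math.sin(th) * math.cos(th)                 # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)     # Gamma^phi_{theta phi}
    return G

def d_gamma(x, i, h=1e-5):
    """Central-difference estimate of Gamma^l_{kj,i}, returned as dG[l][k][j]."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    Gp, Gm = gamma(xp), gamma(xm)
    return [[[(Gp[l][k][j] - Gm[l][k][j]) / (2 * h) for j in range(N)]
             for k in range(N)] for l in range(N)]

def riemann(x):
    """R[l][k][i][j] = R^l_{kij} in a coordinate basis, following (6.20)."""
    G = gamma(x)
    dG = [d_gamma(x, i) for i in range(N)]   # dG[i][l][k][j] = Gamma^l_{kj,i}
    R = [[[[0.0] * N for _ in range(N)] for _ in range(N)] for _ in range(N)]
    for l in range(N):
        for k in range(N):
            for i in range(N):
                for j in range(N):
                    R[l][k][i][j] = (
                        dG[i][l][k][j] - dG[j][l][k][i]
                        + sum(G[m][k][j] * G[l][m][i] - G[m][k][i] * G[l][m][j]
                              for m in range(N)))
    return R

theta = 1.1
R = riemann((theta, 0.3))
```

With these conventions the component `R[0][1][0][1]` (that is, $R^\theta{}_{\phi\theta\phi}$) comes out as $\sin^2\theta$, and the array is antisymmetric in its last two indices as (6.23a) requires.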


6.9 Geometric interpretation of the Riemann tensor 

Like the interpretation of the other commutator we have studied,
$[\bar U, \bar V]$, this involves a closed or almost-closed loop. Our approach will be based
upon the exponentiation of the covariant derivative, and so will closely parallel
that for the Lie bracket. If a vector field $\bar A$ is defined along a curve whose tangent
is $\bar U$, then parallel-transport permits us to bring $\bar A$ from any point Q on the curve
to any other point P. The vector so produced, $\bar A(Q \to P)$ in $T_P$ (not in general
equal to $\bar A(P)$) is called the image at P of $\bar A(Q)$, and it depends of course on the
curve. In fact, if $\bar A$ and $\bar U$ are analytic we can write the Taylor series

$$\begin{aligned}
\bar A(Q \to P) &= \bar A(P) + \lambda\nabla_{\bar U}\bar A(P) + \tfrac12\lambda^2\nabla_{\bar U}\nabla_{\bar U}\bar A(P) + \dots \\
&= \exp[\lambda\nabla_{\bar U}]\,\bar A\,|_P, \qquad (6.26)
\end{aligned}$$

where $\lambda$ is the curve’s parameter ($\bar U = d/d\lambda$) and the ‘exp’ notation is again just a
shorthand for the line above it.

Now consider two congruences with tangents $\bar U = d/d\lambda$ and $\bar V = d/d\mu$, for
which $[\bar U, \bar V] = 0$. Their intersections therefore form closed loops, as shown in
figure 6.6. If we parallel-transport a vector from some point R to Q along a curve
$\bar V$ as shown, we thereby define a vector at Q

$$\bar A(R \to Q) = \exp[\mu\nabla_{\bar V}]\,\bar A\,|_Q,$$

where $\mu$ is the parameter distance from Q to R. If we then parallel-transport the
resulting vector from Q to P, we get at P a vector we call

$$\bar A(R \to Q \to P) = \exp[\lambda\nabla_{\bar U}]\exp[\mu\nabla_{\bar V}]\,\bar A\,|_P,$$

where $\lambda$ is the parameter distance from P to Q. We could have done the trans-
porting another way, namely by first going to S (a distance $\lambda$ along a $\bar U$-curve)
and then to P (a distance $\mu$ along a $\bar V$-curve). The values of $\lambda$ and $\mu$ are the same
as above because $\bar U$ and $\bar V$ commute. The second method would produce

$$\bar A(R \to S \to P) = \exp[\mu\nabla_{\bar V}]\exp[\lambda\nabla_{\bar U}]\,\bar A\,|_P.$$

Their difference, which we shall call $\delta\bar A$, can be found for small $\lambda$ and $\mu$ by using
the Taylor expansion:

$$\begin{aligned}
\delta\bar A &= [e^{\lambda\nabla_{\bar U}}, e^{\mu\nabla_{\bar V}}]\,\bar A \\
&= [1 + \lambda\nabla_{\bar U} + \tfrac12\lambda^2\nabla_{\bar U}\nabla_{\bar U},\ 1 + \mu\nabla_{\bar V} + \tfrac12\mu^2\nabla_{\bar V}\nabla_{\bar V}]\,\bar A + O(3),
\end{aligned}$$

where O(3) means terms in $\mu^n\lambda^m$, where $n + m \geq 3$. Evaluating this gives

$$\delta\bar A = \lambda\mu\,[\nabla_{\bar U}, \nabla_{\bar V}]\,\bar A + O(3), \qquad (6.27)$$

which is of course just the Riemann tensor, and does not involve derivatives of
$\bar A$. Viewed another way, this is the change in $\bar A$ that would be produced if we
were to parallel-transport it around the loop PQRSP. This change is just the
Riemann tensor times the ‘area’ of the loop, $\lambda\mu$:

$$\delta A^i = \lambda\mu\,R^i{}_{jkl}\,A^jU^kV^l.$$
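The loop construction above can be tried numerically. The following sketch parallel-transports a vector around a coordinate rectangle on the unit two-sphere and measures how far it has rotated on return. It assumes the standard Christoffel symbols of the sphere's metric connection (metric connections are only introduced later in the chapter) and uses the well-known fact that on the unit sphere the rotation angle equals the enclosed area in magnitude; the integrator, step count, and function names are illustrative.

```python
import math

def gamma(theta):
    """Nonzero Gamma^i_{jk} of the unit sphere; index 0 = theta, 1 = phi (assumed values)."""
    cot = math.cos(theta) / math.sin(theta)
    return {(0, 1, 1): -math.sin(theta) * math.cos(theta),
            (1, 0, 1): cot,
            (1, 1, 0): cot}

def rate(x, A, U):
    """Parallel-transport equation: dA^i/ds = -Gamma^i_{jk} A^j U^k at the point x."""
    out = [0.0, 0.0]
    for (i, j, k), g in gamma(x[0]).items():
        out[i] -= g * A[j] * U[k]
    return out

def transport(x, A, U, length, steps=2000):
    """Carry A along the coordinate line through x with constant coordinate tangent U (RK4)."""
    h = length / steps
    for _ in range(steps):
        mid = [x[0] + 0.5 * h * U[0], x[1] + 0.5 * h * U[1]]
        end = [x[0] + h * U[0], x[1] + h * U[1]]
        k1 = rate(x, A, U)
        k2 = rate(mid, [a + 0.5 * h * k for a, k in zip(A, k1)], U)
        k3 = rate(mid, [a + 0.5 * h * k for a, k in zip(A, k2)], U)
        k4 = rate(end, [a + h * k for a, k in zip(A, k3)], U)
        A = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(A, k1, k2, k3, k4)]
        x = end
    return x, A

def loop_rotation(theta0, dtheta, dphi):
    """Signed rotation angle of a vector carried once around a coordinate rectangle."""
    x, A = [theta0, 0.0], [1.0, 0.0]
    legs = (([0.0, 1.0], dphi), ([1.0, 0.0], dtheta),
            ([0.0, -1.0], dphi), ([-1.0, 0.0], dtheta))
    for U, length in legs:
        x, A = transport(x, A, U, length)
    # Read the angle off in an orthonormal frame: (A^theta, sin(theta) A^phi).
    return math.atan2(math.sin(x[0]) * A[1], A[0])

theta0, dtheta, dphi = 1.0, 0.3, 0.4
angle = loop_rotation(theta0, dtheta, dphi)
area = dphi * (math.cos(theta0) - math.cos(theta0 + dtheta))
```

For this traversal order the vector returns rotated by an angle whose magnitude matches the enclosed area, which is the finite-loop counterpart of the statement that the change is the curvature times the ‘area’ of the loop.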

Fig. 6.6. Parallel transport around a closed loop generally does not
return the same vector as it began with.

Another important geometrical aspect of the Riemann tensor involves geo-
desic deviation, the fact that geodesics begun parallel do not stay parallel. To
measure this precisely, we consider a congruence of geodesics with tangent $\bar U$
($\nabla_{\bar U}\bar U = 0$), and a connecting vector $\bar\xi$ which is Lie dragged by the congruence
($\pounds_{\bar U}\bar\xi = 0$) (see figure 6.7). The manner in which $\bar\xi$ changes along $\bar U$ will be our
measure of geodesic deviation. Its first derivative, $\nabla_{\bar U}\bar\xi$, depends upon initial con-
ditions, upon whether the geodesics are set up initially parallel or not. The geom-
etry enters into the second derivative $\nabla_{\bar U}\nabla_{\bar U}\bar\xi$, which tells how the initial rate of
separation of the geodesics changes. So we have

$$\begin{aligned}
\nabla_{\bar U}\nabla_{\bar U}\bar\xi &= \nabla_{\bar U}(\pounds_{\bar U}\bar\xi + \nabla_{\bar\xi}\bar U) \\
&= \nabla_{\bar U}\nabla_{\bar\xi}\bar U \\
&= [\nabla_{\bar U}, \nabla_{\bar\xi}]\bar U + \nabla_{\bar\xi}\nabla_{\bar U}\bar U.
\end{aligned}$$

The first step used exercise 6.11. The last term in the last line vanishes because
$\bar U$ is a geodesic, so we have

$$\nabla_{\bar U}\nabla_{\bar U}\bar\xi = \mathbf R(\bar U, \bar\xi)\bar U, \qquad (6.28a)$$

or in component form

$$(\xi^i{}_{;j}U^j)_{;k}U^k = R^i{}_{jkl}\,U^jU^k\xi^l.$$

Notice that the left-hand side can be simplified because $U^j{}_{;k}U^k = 0$, and so we
get

$$\xi^i{}_{;jk}\,U^jU^k = R^i{}_{jkl}\,U^jU^k\xi^l. \qquad (6.28b)$$


Equation (6.28) is called the equation of geodesic deviation.


6.10 Flat spaces 

Euclid’s axiom that parallel lines when extended never meet is the 
defining axiom for a flat space. From (6.28) it is clear that this means that a 
space is flat if and only if the Riemann tensor vanishes. Thus, the Riemann 
tensor is the measure of the curvature of a manifold with a connection. A flat 
space, by (6.27), has a global notion of parallelism: a vector at point R can be 
said to be parallel to one at P, because it can be parallel-transported to P in a
manner independent of the path. Thus in a flat space all tangent spaces $T_P$ may
be identified with each other. Moreover, the exponential map is extendible 
indefinitely (provided the manifold’s global topology is not artificially compli- 
cated by ‘cutting and pasting’) and the entire manifold may be identified with its 
tangent space. Notice that none of this requires a metric tensor. Minkowski 
space is just as flat as Euclidean space. 


Fig. 6.7. A connecting vector $\bar\xi$ Lie dragged along a geodesic congruence.

Exercise 6.16
Consider a two-dimensional flat space with Cartesian coordinates x, y
and polar coordinates r, $\theta$.
(a) Use the fact that $\bar e_x$ and $\bar e_y$ are globally parallel vector fields ($\bar e_x(P)$ is
parallel to $\bar e_x(Q)$ for arbitrary P, Q) to show that

$$\Gamma^r{}_{\theta\theta} = -r, \qquad \Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r,$$

and all other $\Gamma$’s are zero in polar coordinates.
(b) For an arbitrary vector field $\bar V$, evaluate $\nabla_iV^j$ and $\nabla_iV^i$ for polar coordi-
nates in terms of the components $V^r$ and $V^\theta$.
(c) For the basis $\hat r = \partial/\partial r$, $\hat\theta = (1/r)\,\partial/\partial\theta$ find all the Christoffel symbols.
(d) Same as (b) for the basis in (c).


This exercise makes the important point that, although on a flat manifold
coordinates exist in which $\Gamma^i{}_{jk} = 0$ everywhere, it is possible to choose coordi-
nates in which they do not vanish.


6.11 Compatibility of the connection with the volume-measure or the metric 
If a manifold has not only a connection but also a volume form or a 
metric, one usually makes certain compatibility demands. For example, both the 
connection and the volume-form can define the divergence of a vector field V. 
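The covariant divergence is $\nabla\cdot\bar V = \nabla_iV^i$. The volume-form divergence is defined
by

$$\pounds_{\bar V}\tilde\omega = (\mathrm{div}_{\tilde\omega}\bar V)\,\tilde\omega.$$

We say that $\nabla$ and $\tilde\omega$ are compatible if $\mathrm{div}_{\tilde\omega}\bar V = \nabla\cdot\bar V$ for all $\bar V$.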


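Exercise 6.17
(a) Show that $\nabla$ and $\tilde\omega$ are compatible if and only if $\nabla\tilde\omega = 0$. (Hint: use
exercise 6.11 to evaluate $\pounds_{\bar V}\tilde\omega$.)
(b) In coordinates $(x^1, \dots, x^n)$ suppose $\omega_{1\dots n} = f$. Show that $\nabla$ and $\tilde\omega$
are compatible if and only if for all k

$$(\ln f)_{,k} = \Gamma^i{}_{ik}.$$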


In a similar way, there is a natural compatibility demand if the manifold has a
metric tensor $\mathbf g$. Two vectors $\bar A$ and $\bar B$ have the inner product $\mathbf g(\bar A, \bar B)$ at a point
P. We say that $\nabla$ and $\mathbf g$ are compatible if this inner product is preserved by
parallel-transport of $\bar A$ and $\bar B$ along any curve, for any vectors $\bar A$ and $\bar B$.


Exercise 6.18
(a) Show that $\nabla$ and $\mathbf g$ are compatible if and only if

$$\nabla\mathbf g = 0. \qquad (6.29)$$

(b) In coordinates $(x^1, \dots, x^n)$ show that $\nabla$ and $\mathbf g$ are compatible if and
only if

$$\Gamma^i{}_{jk} = \tfrac12\,g^{il}\,(g_{lj,k} + g_{lk,j} - g_{jk,l}). \qquad (6.30)$$

Here $g^{il}$ are the elements of the matrix inverse to the matrix of compo-
nents $g_{ij}$ (cf. equation (2.55)). (Hint: use the symmetry $\Gamma^i{}_{jk} = \Gamma^i{}_{kj}$.)


Exercise 6.19 

Recall that exercise 4.13 enables one to define a preferred volume-form 
if one has a metric. (This is another compatibility, that of the metric 
and volume-form). Show that if the metric and connection are com- 
patible, then the preferred volume-form and the connection are com-
patible. (Hint: you will have to show that $g_{,k} = g\,g^{ij}g_{ij,k}$, where $g$ is the
determinant of the matrix $g_{ij}$. Use equation (4.39) for this purpose.)


Equation (6.30) shows the remarkable fact that a metric actually determines 
the compatible symmetric connection uniquely. Such a connection is called a 
metric connection. 
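Since (6.30) expresses the connection entirely in terms of the metric, it can be evaluated mechanically. The sketch below does this with central differences for the flat plane in polar coordinates, where the metric is $\mathrm{diag}(1, r^2)$, and recovers the Christoffel symbols quoted in exercise 6.16. The function names, the step size, and the restriction to two dimensions (so that the inverse can be written out by hand) are illustrative assumptions.

```python
def metric(x):
    """Flat plane in polar coordinates x = (r, theta): g = diag(1, r^2)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, written out explicitly."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, k, h=1e-5):
    """Central-difference estimate of g_{ij,k}, returned as dg[i][j]."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x):
    """G[i][j][k] = Gamma^i_{jk} = (1/2) g^{il} (g_{lj,k} + g_{lk,j} - g_{jk,l}), eq. (6.30)."""
    n = 2
    ginv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]   # dg[k][i][j] = g_{ij,k}
    return [[[0.5 * sum(ginv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][j][k])
                        for l in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

G = christoffel((2.0, 0.5))
```

At $r = 2$ this gives $\Gamma^r{}_{\theta\theta} = -2$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/2$, with all other components zero to rounding error.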


Exercise 6.20
Show that for an arbitrary vector $\bar V$

$$(\pounds_{\bar V}\mathbf g)_{ij} = \nabla_iV_j + \nabla_jV_i.$$

Therefore a Killing vector (cf. §3.11) obeys Killing’s equation

$$\nabla_iV_j + \nabla_jV_i = 0.$$

Cf. equation (5.89).


6.12 Metric connections

Because (6.30) is such a strong constraint on the connection, metric
connections have additional properties that general symmetric connections do
not. To derive some of them it is easiest to work in a normal coordinate system.
Notice that (6.29) and (6.30) imply

$$\Gamma^i{}_{jk} = 0 \ \text{at}\ P \;\Rightarrow\; g_{ij,k} = 0 \ \text{at}\ P. \qquad (6.31)$$
Exercise 6.21
Show that (6.20), (6.30), and (6.31) imply that in normal coordinates
at a point P

$$R_{ijkl} \equiv g_{im}R^m{}_{jkl} = \tfrac12\,(g_{il,jk} - g_{ik,jl} + g_{jk,il} - g_{jl,ik}). \qquad (6.32)$$

Exercise 6.22
(a) Show that (6.32) implies the identity

$$R_{ijkl} = R_{klij}. \qquad (6.33)$$

(b) Show that (6.33) and (6.23) imply that in an n-dimensional manifold
the number of linearly independent components of $R_{ijkl}$ is

$$\tfrac18\,n(n-1)(n^2 - n + 2) - \tfrac1{24}\,n(n-1)(n-2)(n-3) = \tfrac1{12}\,n^2(n^2 - 1).$$
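The two component counts, (6.24) for a general symmetric connection and the one in exercise 6.22(b) for a metric connection, are simple enough to tabulate. The sketch below counts the components combinatorially and compares with the closed forms; the function names are illustrative, and the counting arguments in the comments are the usual ones rather than anything spelled out in the text.

```python
from math import comb

def count_symmetric_connection(n):
    """Exercise 6.14(d): n^2 free (l, k) pairs times C(n, 2) antisymmetric (i, j) pairs,
    minus n * C(n, 3) cyclic constraints from (6.23b)."""
    return n * n * comb(n, 2) - n * comb(n, 3)

def count_metric_connection(n):
    """Exercise 6.22(b): a symmetric matrix on the C(n, 2) antisymmetric index pairs,
    minus C(n, 4) further constraints from the cyclic identity."""
    pairs = comb(n, 2)
    return pairs * (pairs + 1) // 2 - comb(n, 4)

table = {n: (count_symmetric_connection(n), count_metric_connection(n))
         for n in range(1, 8)}
```

For n = 4 the table entry is (80, 20): eighty independent components for a general symmetric connection, and the familiar twenty for a metric connection.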


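Exercise 6.23
(a) Define the tensor $R_{kj}$, called the Ricci tensor, by

$$R_{kj} = R^i{}_{kij}, \qquad (6.34)$$

and the Ricci scalar $R$ by

$$R = g^{kj}R_{kj}. \qquad (6.35)$$

Show that $R_{ij}$ is symmetric.
(b) Show that the contracted Bianchi identities

$$R^i{}_{k[ij;m]} = 0 \quad\text{and}\quad g^{km}R^i{}_{k[ij;m]} = 0$$

imply

$$(R^{ij} - \tfrac12 R\,g^{ij})_{;j} = 0. \qquad (6.36)$$

(Raising indices on $R_{ij}$ is accomplished by the metric: $R^{ij} = g^{ik}g^{jl}R_{kl}$.)
(c) Define the Weyl tensor:

$$C^{ij}{}_{kl} = R^{ij}{}_{kl} - 2\,\delta^{[i}{}_{[k}R^{j]}{}_{l]} + \tfrac13\,\delta^{[i}{}_{[k}\delta^{j]}{}_{l]}R. \qquad (6.37)$$

Show that every contraction between indices of $C^{ij}{}_{kl}$ gives zero: it is a
‘pure’ fourth rank tensor.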


Equation (6.36) plays a fundamental role in Einstein’s theory of gravitation 
(general relativity). Spacetime is represented as a four-dimensional manifold with 
metric, a generalization of flat Minkowski spacetime. The empty-space (source- 
free) gravitational field (i.e. metric) is found by solving the differential equations 


$$G^{ij} \equiv R^{ij} - \tfrac12 R\,g^{ij} = 0, \qquad (6.38)$$

where $G^{ij}$ is called the Einstein tensor. The identities (6.36) reduce the number
of independent equations in (6.38) from 10 ($= \tfrac12 n(n + 1)$ because $G^{ij}$ is sym-
metric) to 6. This guarantees that the solution, $g_{ij}$, which also has 10 independent
components, is determined only up to the four functional degrees of freedom
represented by the coordinate transformations of $g_{ij}$.


Exercise 6.24 
Show that a geodesic joining points P and Q is a curve of extremal
length among all curves joining P and Q. Do this by showing that

$$\int_P^Q \left| g_{ij}\,\frac{dx^i}{d\lambda}\frac{dx^j}{d\lambda} \right|^{1/2} d\lambda$$

is unchanged to first order by changes in $x^i(\lambda)$ away from a geodesic
curve. (Bear in mind that any curve has a unique parameter; a geodesic 
curve’s parameter must be affine.) Discuss the need for the absolute 
value signs above if the metric is indefinite, and in particular discuss 
separately the case of a null geodesic (length zero). 








6.13 The affine connection and the equivalence principle 
We all learned our basic geometry and physics by studying flat mani-
folds: Euclidean three-space, Galilean spacetime (though it probably was not
given that name) and later (if at all), Minkowski spacetime. General relativity, on 
the other hand, uses a curved spacetime. It seems natural to think of a flat space 
as the simplest kind of space. But from the point of view of manifold theory, 
even a flat space is by no means simple: it has far more structure than the ordin- 
ary differentiable manifold, for it has an affine connection. The existence of this 
connection does not intrude into elementary geometry and physics, because one 
usually adopts rectangular coordinates, in which the Christoffel symbols vanish. 
But if the physical laws are framed in flat space using curvilinear coordinates, 
then the Christoffel symbols must be used, and the connection becomes visible. 

It may seem that this is a complication to be avoided, but consider its potential 
for generalization. Most physical laws written in this way involve the Christoffel 
symbols but not the Riemann tensor, so their equations are meaningful — 
identical — whether the manifold is flat or curved. It is therefore natural to 
postulate that in the curved spacetime of general relativity the laws of physics 
have exactly the same mathematical form as they have in a curved coordinate 
system in the flat spacetime of Minkowski space. This is called the principle of 
minimal coupling (of physical fields to the curvature of spacetime), or the strong 
principle of equivalence. It is a postulate, widely adopted, which is consistent 
with experiment. A full discussion of it can be found in Misner et al. (1973). The 
point that needs to be emphasized here is the rather remarkable circumstance
that, by expressing the flat-space laws of physics in a curved coordinate system,
one obtains the curved-space form of the laws. This circumstance can be traced
to the fact that flat space, though having zero curvature, has a perfectly definite 
connection and is therefore only a special kind of ‘curved’ space. 


6.14 Connections and gauge theories: the example of electromagnetism 
‘Gauge theories’ is the collective name for a large variety of theories of
elementary particle interactions, all of which share one feature: invariance of
their physical predictions under a group of transformations of the basic variables
of the field theory. Electromagnetism is the best-known example: if the basic
variable is taken to be the one-form (‘vector’) potential $\tilde A$, then the physical pre-
dictions of the theory are invariant under the gauge transformation $\tilde A \to \tilde A + \tilde d f$.
The word ‘gauge’ is applied by analogy to the transformations of all these 
theories. A general discussion of gauge theories is beyond our scope (see the 
lectures by Trautman (1973) in the bibliography). We will confine our remarks 
to electromagnetism, illustrated with the equation for a charged particle of mass 
m and zero spin. We will see that a connection different from but in the same 
spirit as the affine connection arises in a natural way and in particular leads us to 
‘invent’ the electromagnetic field! 

Consider first the neutral scalar particle of mass m whose wave-function $\psi$
obeys the Klein-Gordon equation and the (conserved) normalization condition

$$(\nabla_\mu\nabla^\mu - m^2)\psi = 0, \qquad i\int d^3x\,(\psi^*\psi_{,t} - \psi\,\psi^*{}_{,t}) = 1, \qquad (6.39)$$

where Greek indices run over (t, x, y, z), and where we assume for simplicity the
metric of Minkowski spacetime. Clearly, if $\psi$ is a solution then so is $\psi e^{i\phi}$, where
$\phi$ is any real constant. This is a gauge transformation: $\psi \to \psi e^{i\phi}$. We shall now
make an analogy which will carry through our whole discussion. The gauge trans-
formations are of a very restricted sort, since $\phi$ cannot depend on position. This
is analogous to the coordinate-freedom in the description of some (any) physical
system in rectangular coordinates in special relativity. The permissible coordinate
transformations are the rotations, Lorentz boosts, and translations, and these
are all rigid: one cannot make one transformation at one point and a different
one somewhere else. Relaxing this restriction in special relativity in order to per-
mit arbitrary coordinates forces one, as we have seen in the last section, to intro-
duce the affine connection in order to preserve a coordinate-independent deriv-
ative, the covariant derivative. Once the equations of motion of the physical sys-
tem are written down with a connection in them, it is natural to use them when
the connection is not flat. These turn out to be the appropriate equations for
that system in general relativity. This procedure of generalizing the coordinate
freedom thus leads to a theory of the way the system interacts with a
gravitational field. In a similar manner, we will now generalize the gauge freedom
of the field $\psi$ and find, automatically, a theory for how the field interacts with
electromagnetism.

The generalization is obvious: we would like a general gauge transformation

$$\psi \to \psi\,e^{i\phi(x)}, \qquad (6.40)$$

where now $\phi$ is an arbitrary real function of position $x$ in Minkowski spacetime.
But because the field equations involve derivatives, this produces a change

$$\tilde d\psi \to (\tilde d\psi + i\psi\,\tilde d\phi)\,e^{i\phi(x)}. \qquad (6.41)$$

To see how to eliminate this extra term, let us look at the situation more geo-
metrically. The factor $e^{i\phi}$ is a complex number on the unit circle; the gauge
transformation is a representation of the action of the group U(1) (unitary group
in one complex dimension) on $\psi$. So the transformation $\psi \to \psi e^{i\phi}$ can be
thought of as picking out an element of U(1) at each $x$ and allowing it to act on
$\psi$. The natural geometrical structure here is the fiber bundle, whose base manifold
is Minkowski spacetime and whose fibers are the group U(1) (which can be visual-
ized as the unit circle in the complex plane). A gauge transformation (6.40) is
then a cross-section of the fiber bundle. We shall call this bundle the U(1)-bundle.

Now, the thing we want to look at is not $\psi$ itself but $\nabla_\mu\psi$, which at any point
P is an element of $T^*_P$, the vector space of one-forms at P. Consider a curve $\mathscr C$
with parameter $\lambda$ in the base manifold. As we move along the curve we encounter
a sequence of one-forms $\tilde d\psi$, one at each point. If $\psi$ satisfies the Klein-Gordon
equation, (6.39), we will say that $\tilde d\psi$ changes along $\mathscr C$ in the ‘correct’ manner.
There is a restricted set of gauge transformations ($\phi$ = const) for which the new
$\tilde d\psi$ is also ‘correct’. Suppose we make an arbitrary gauge transformation. Then
to the curve $\mathscr C$ in the base manifold there corresponds a curve $\mathscr C^*$ in the U(1)-
bundle which passes through each fiber above a point of $\mathscr C$ at the point on the
fiber (element of U(1)) which corresponds to the gauge transformation at that
point of $\mathscr C$. If this transformation is not constant (if $\mathscr C^*$ is not ‘parallel’ to $\mathscr C$)
then the gradient of the transformed $\psi$ will not be ‘correct’: it will not equal, to
within a phase, that of the original $\psi$. So we will define a connection one-form
$\tilde A$ on the base manifold, which will depend upon the curve $\mathscr C^*$ in such a way as
to correct the derivative of $\psi$. The definition is:

(i) If $\psi$ solves (6.39) then $\tilde A = 0$.
(ii) Under a transformation $\psi \to \psi e^{i\phi}$ the connection one-form trans-
forms as

$$\tilde A \to \tilde A + \tilde d\phi. \qquad (6.42)$$

(iii) The gauge-covariant derivative of $\psi$ is

$$D\psi = \tilde d\psi - i\psi\tilde A. \qquad (6.43)$$

Properties (ii) and (iii) mean that $D\psi$ changes, under a gauge transformation, to
$e^{i\phi}D\psi$; property (i) guarantees that $D\psi$ is ‘correct’ on $\mathscr C$.

Let us now understand why A is called a connection. The affine connection is 
represented by the Christoffel symbols, which are added to the ordinary partial 
derivative in order to give a ‘correct’ derivative: one which gives parallel trans- 
port (cf. (6.43) with (6.10)). In order to preserve the ‘correctness’ of the deriv- 
ative, the Christoffel symbols must transform under a coordinate change in a 
manner which depends on the coordinate change (exercise 6.2) in a way very 
similar to the way A changes under a gauge transformation (6.42). The differ- 
ence between the connections is what they set out to preserve: an affine con- 
nection preserves parallelism; our one-form connection preserves the gradient 
under a gauge transformation. 
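The covariance property of (6.43) under (6.42) can be spot-checked numerically in one dimension. In the sketch below the wave-function, the connection component, and the gauge function are arbitrary smooth functions invented purely for illustration; derivatives are taken by central differences, so agreement is only up to a small discretization error.

```python
import cmath
import math

def psi(x):
    """Arbitrary smooth complex wave-function (illustrative choice)."""
    return cmath.exp(1j * 1.3 * x) * (1.0 + 0.2 * x)

def A(x):
    """Arbitrary connection component (illustrative choice)."""
    return 0.5 * math.sin(x)

def phi(x):
    """Arbitrary position-dependent gauge function (illustrative choice)."""
    return 0.7 * x * x

def deriv(f, x, h=1e-5):
    """Central-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def D(psi_fn, A_fn, x):
    """One component of (6.43): D psi = d psi - i psi A."""
    return deriv(psi_fn, x) - 1j * psi_fn(x) * A_fn(x)

def psi_new(x):
    """Gauge-transformed wave-function, psi -> psi exp(i phi)."""
    return psi(x) * cmath.exp(1j * phi(x))

def A_new(x):
    """Gauge-transformed connection, A -> A + d phi, as in (6.42)."""
    return A(x) + deriv(phi, x)

x0 = 0.4
lhs = D(psi_new, A_new, x0)
rhs = cmath.exp(1j * phi(x0)) * D(psi, A, x0)
plain = deriv(psi_new, x0) - cmath.exp(1j * phi(x0)) * deriv(psi, x0)
```

The quantities `lhs` and `rhs` agree, whereas the ordinary derivative picks up the extra term of (6.41), which is what `plain` measures.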

We can now write the gauge-covariant form of the Klein-Gordon equation:

$$D_\mu D^\mu\psi - m^2\psi = (\nabla_\mu - iA_\mu)(\nabla^\mu - iA^\mu)\psi - m^2\psi = 0. \qquad (6.44)$$

This equation reduces to the usual Klein-Gordon equation if the phase of $\psi$ is
‘correct’; and any $\psi$ obtained by an arbitrary gauge transformation of a ‘correct’
$\psi$ solves (6.44).

The curvature tensor of an affine connection can be defined in a coordinate
system by an equation like (6.18):

$$[\nabla_\mu, \nabla_\nu]V^\alpha = R^\alpha{}_{\beta\mu\nu}V^\beta.$$

The analogue here is

$$[D_\mu, D_\nu]\psi = F_{\mu\nu}\psi. \qquad (6.45)$$

It is a straightforward calculation to show that the gauge-curvature two-form $\tilde F$
whose components are $F_{\mu\nu}$ is simply

$$\tilde F = -i\,\tilde d\tilde A. \qquad (6.46)$$
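The calculation behind (6.46) can be sketched as follows. It assumes Minkowski spacetime in rectangular coordinates, so that the derivatives $\nabla_\mu$ commute on $\psi$, and the component convention $(\tilde d\tilde A)_{\mu\nu} = A_{\nu,\mu} - A_{\mu,\nu}$; both are assumptions of this sketch rather than statements taken from the passage.

```latex
\begin{aligned}
D_\mu D_\nu\psi
  &= (\nabla_\mu - iA_\mu)(\nabla_\nu\psi - iA_\nu\psi) \\
  &= \nabla_\mu\nabla_\nu\psi - i(\nabla_\mu A_\nu)\psi
     - iA_\nu\nabla_\mu\psi - iA_\mu\nabla_\nu\psi - A_\mu A_\nu\psi, \\[4pt]
[D_\mu, D_\nu]\psi
  &= D_\mu D_\nu\psi - D_\nu D_\mu\psi \\
  &= -i\,(\nabla_\mu A_\nu - \nabla_\nu A_\mu)\,\psi
   = -i\,(\tilde d\tilde A)_{\mu\nu}\,\psi .
\end{aligned}
```

All terms symmetric in $\mu$ and $\nu$ cancel in the commutator, leaving only the antisymmetrized derivative of $A$, so no derivatives of $\psi$ survive.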
Clearly $\tilde F$ is gauge-invariant (cf. (ii) above). (The Riemann tensor was coordinate-
invariant.) The Klein-Gordon equation is gauge-flat ($\tilde F = 0$) because there exists
a gauge in which $\tilde A = 0$. But because of the obvious analogy with electromagnet-
ism ($\tilde A$ = one-form potential, $i\tilde F$ = Faraday tensor: see chapter 5), it is clearly
tempting to regard (6.44) as a generalization of the Klein-Gordon equation to
the case where the particle has charge and interacts with an external electromag-
netic field $\tilde F$. This is in fact correct, and (6.44) can be derived more directly
from the fact that the canonical momentum of a classical particle in an external
electromagnetic field is $p_c = p + (q/c)A$, where $p$ is the ‘true’ four-momentum
of the particle. By the correspondence principle the equation $p\cdot p + m^2 = 0$
becomes (in units where $\hbar = c = 1$)

$$(-i\nabla_\mu - qA_\mu)(-i\nabla^\mu - qA^\mu)\psi + m^2\psi = 0.$$

This shows us that (6.44) is the wave equation for such a particle with charge
$q = 1$. We can summarize what we have learned in the following way: a scalar
particle of mass m and charge q in the presence of an external electromagnetic
field with one-form potential $\tilde A$ obeys the equation

$$(\nabla_\mu - iqA_\mu)(\nabla^\mu - iqA^\mu)\psi - m^2\psi = 0. \qquad (6.47)$$

A gauge-transformation consists of the following:

$$\tilde A \to \tilde A + \tilde d\phi, \qquad (6.48a)$$

$$\psi \to \psi\,e^{iq\phi}. \qquad (6.48b)$$

We can regard $\tilde A$ as a connection on the U(1)-bundle and $\tilde F$ as its curvature.


Exercise 6.25 
(a) Verify that $D\psi \to e^{i\phi}D\psi$ under a gauge transformation.
(b) Verify (6.46).


SELECTED EXERCISES


$[\hat r, \hat\theta] = -\hat\theta/r \neq 0$.

Since $a\,d/d\lambda + b\,d/d\mu = b\,d/d\mu + a\,d/d\lambda$, equation (2.13) implies
$\exp[a\,d/d\lambda]\exp[b\,d/d\mu] = \exp[b\,d/d\mu]\exp[a\,d/d\lambda]$, which certainly
implies $[d/d\lambda, d/d\mu] = 0$. Conversely, if the order of $d/d\lambda$ and $d/d\mu$ on
the right-hand side of (2.13) does not matter, then they may be mani-
pulated exactly as real numbers, for which (2.13) is true.

Expand out each term, e.g. $[[\bar X, \bar Y], \bar Z] = XYZ - YXZ - ZXY + ZYX$.
Each such term is to be interpreted as a differential operator
on functions. The $C^2$ requirement guarantees each term exists. When all
three terms in (2.14) are so expanded, the result follows.

Each matrix is a $\binom{1}{1}$ tensor requiring one vector and one one-form to
give a real number. Since there are two matrices involved, the transfor-
mation can produce a number when supplied with two vectors and two
one-forms. Linearity is easy to check.

In n dimensions a $\binom{2}{0}$ tensor has $n^2$ independent components, while
two vectors have between them only $2n$ components. In general this is
inadequate.

A linear combination of two $\binom{2}{0}$ tensors is defined in terms of their
values on arbitrary one-forms $\tilde p$ and $\tilde q$: $(\alpha\mathbf h + \beta\mathbf r)(\tilde p, \tilde q) = \alpha\mathbf h(\tilde p, \tilde q)
+ \beta\mathbf r(\tilde p, \tilde q)$. This is still a linear function of $\tilde p$ and $\tilde q$, and so is a $\binom{2}{0}$
tensor. The zero tensor has value zero on any $\tilde p$ and $\tilde q$, and the other
axioms are also obvious. The space has $n^2$ dimensions because each
tensor $\mathbf h$ is completely defined by its $n^2$ components $\mathbf h(\tilde\omega^i, \tilde\omega^j) = h^{ij}$.
The $n^2$ tensors $\{\bar e_i \otimes \bar e_j\}$ are a basis because they are linearly indepen-
dent: the linear combination $\beta^{ij}\,\bar e_i \otimes \bar e_j$ vanishes if and only if all $\beta^{ij}$
vanish, as can be seen by allowing the tensor to operate on all pairs of
basis one-forms. In fact it is easy to verify that $\mathbf h = h^{ij}\,\bar e_i \otimes \bar e_j$.

Six. Six. 

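Linearity: $C(a\bar V + b\bar W) = B(A(a\bar V + b\bar W)) = B(aA(\bar V) + bA(\bar W))
= aB(A(\bar V)) + bB(A(\bar W)) = aC(\bar V) + bC(\bar W)$.

Components: if $A(\bar e_j) = A^k{}_j\,\bar e_k$ then $C(\bar e_j) = B(A^k{}_j\,\bar e_k) = B(\bar e_k)A^k{}_j
= B^i{}_kA^k{}_j\,\bar e_i$, from which the result follows.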
2.10
2.11 


2.12 


2.13 


2.14 


2.15 


2.16 
Each $\binom{1}{1}$ tensor is a linear transformation of $T_P$. Our result shows that
linear transformations form a group under the operation of composi-
tion (by which C was produced from A and B).

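$\mathbf T(\tilde\omega^{i'}, \bar e_{j'}) = \mathbf T(\Lambda^{i'}{}_k\tilde\omega^k, \Lambda^l{}_{j'}\bar e_l) = \Lambda^{i'}{}_k\Lambda^l{}_{j'}\,\mathbf T(\tilde\omega^k, \bar e_l)$. This generalizes
to $T^{i'\dots j'}{}_{k'\dots l'} = \Lambda^{i'}{}_m \cdots \Lambda^{j'}{}_n\,\Lambda^r{}_{k'} \cdots \Lambda^s{}_{l'}\,T^{m\dots n}{}_{r\dots s}$, where
$(i \dots j)$ are N indices and $(k \dots l)$ are N′ indices.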

True of any vector space: the zero element is unique. 

The value of the tensor (in the original basis) on a one-form $\tilde p$ and a
vector $\bar V$ is $A^i{}_j V^j p_i$. In the new basis it is
$A^{i'}{}_{j'} V^{j'} p_{i'} = (\Lambda^{i'}{}_k \Lambda^l{}_{j'} A^k{}_l)(\Lambda^{j'}{}_m V^m)(\Lambda^n{}_{i'} p_n)$,
where we have used the transformed components
of everything. Doing the sums on $i'$ and $j'$ first and using (2.34) gives
that the new value is the same as the old, i.e. that the rule gives a real
number associated with the vector and one-form, independent of the
basis.
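A numerical illustration of this basis independence may help. The following is a minimal sketch under stated assumptions: a two-dimensional space, an arbitrary invertible matrix `L` standing in for the transformation matrix, and arbitrary components for the tensor, vector and one-form. None of these values come from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse2(M):
    # Inverse of a 2x2 matrix, exact when the entries are Fractions.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def value(p, A, V):
    # The number p_i A^i_j V^j
    return sum(p[i] * A[i][j] * V[j] for i in range(2) for j in range(2))

F = Fraction
L = [[F(2), F(1)], [F(1), F(1)]]   # assumed transformation matrix
L_inv = inverse2(L)
A = [[F(1), F(2)], [F(3), F(4)]]   # assumed tensor components A^i_j
V = [F(5), F(-1)]                  # assumed vector components
p = [F(2), F(7)]                   # assumed one-form components

# Transform every object with the appropriate matrix.
A_new = matmul(matmul(L, A), L_inv)
V_new = [sum(L[i][j] * V[j] for j in range(2)) for i in range(2)]
p_new = [sum(p[j] * L_inv[j][i] for j in range(2)) for i in range(2)]

old_value = value(p, A, V)
new_value = value(p_new, A_new, V_new)
```

Exact rational arithmetic is used so that the two values can be compared with `==` rather than with a floating-point tolerance.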


@) A= εν 12 


1 
produces the canonical form 0 


6 ΙΝ2 i): 

_{ IN2 12 | -1 0 
(b)A = [_ 67 va produces the canonical form | 0 η 
(ο A= ° M2 produces the canonical form Γ a 
a 


(a) The transformation law can be deduced from equation (2.55). As a
function of one-forms, $\mathbf{g}^{-1}(\tilde p, \tilde q) = \mathbf{g}(\bar p, \bar q)$ leads to (2.55) and
clearly shows the linearity property.

(b) In such a basis, the metric has components $\pm\delta_{ij}$, so $\mathbf{g}^{-1}$ has
components $\pm\delta^{ij}$ as well, being just the inverse matrix.

Make a Taylor expansion of $g_{ij}$ and $\Lambda^{i'}{}_j$ about $P$. Try to satisfy (i) and
(ii) by choosing the coefficients of the $\Lambda^{i'}{}_j$ expansion appropriately for
arbitrary coefficients of the $g_{ij}$ expansion. Show, by counting the
number of coefficients, that (i)-(iii) hold. Remember that not all
$\partial\Lambda^{i'}{}_j/\partial x^k$ at $P$ are independent, since $\Lambda^{i'}{}_j = \partial x^{i'}/\partial x^j$ implies
$\partial\Lambda^{i'}{}_j/\partial x^k = \partial\Lambda^{i'}{}_k/\partial x^j$.

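(a) $g_{rr} = 1$, $g_{r\theta} = 0$, $g_{\theta\theta} = r^2$.

(b) orthonormal. $\hat r = \partial/\partial r$, $\hat\theta = r^{-1}\,\partial/\partial\theta$.

(a) $\tilde d f$ has components $(\partial f/\partial r, \partial f/\partial\theta)$; $\overline{d f}$ has components
$(\partial f/\partial r, r^{-2}\,\partial f/\partial\theta)$.

(b) On the orthonormal basis, both $\tilde d f$ and $\overline{d f}$ have components
$(\partial f/\partial r, r^{-1}\,\partial f/\partial\theta)$.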

(a) On functions, equation (3.3) shows that each side of (3.8) reduces 
to the operator [V, W]. On a vector field U, equation (3.6) gives 


3.2(b) 


3.3 
3.4 


3.5 


3.6 


3.7 


3.8 


3.9 
the left-hand side as $[\bar V, [\bar W, \bar U]] - [\bar W, [\bar V, \bar U]]$ and the right-
hand side as $[[\bar V, \bar W], \bar U]$. The Jacobi identity (exercise 2.3) and
the antisymmetry of the Lie bracket establish the result.

(b) On functions, we have the Jacobi identity (2.14) again. On vectors,
equation (3.8) converts (3.9) into the Lie derivative with respect to
$[[\bar X, \bar Y], \bar Z] + [[\bar Y, \bar Z], \bar X] + [[\bar Z, \bar X], \bar Y]$, which vanishes.
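The vanishing of the cyclic sum can be illustrated with matrix commutators, which obey the same Jacobi identity as Lie brackets of vector fields. This is a sketch under an explicit assumption: the three integer matrices below are arbitrary stand-ins (they can be thought of as linear vector fields), not objects taken from the exercise.

```python
# Illustrative check of the Jacobi identity for commutators of matrices.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(XY, YX)]

# Arbitrary (assumed) 3x3 integer matrices
X = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
Y = [[0, 1, 1], [2, 0, 5], [1, 1, 0]]
Z = [[3, 0, 2], [1, 1, 0], [0, 2, 1]]

cyclic_sum = add(add(commutator(commutator(X, Y), Z),
                     commutator(commutator(Y, Z), X)),
                 commutator(commutator(Z, X), Y))
```

Because the entries are integers, `cyclic_sum` should come out as the exact zero matrix.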

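$\pounds_{\bar V}\bar U = \pounds_{\bar V}(U^i\bar e_i) = (\pounds_{\bar V}U^i)\bar e_i + U^i\,\pounds_{\bar V}\bar e_i = [V^j\bar e_j(U^i)]\bar e_i - U^i\,\pounds_{\bar e_i}(V^j\bar e_j)$

$= [V^j\bar e_j(U^i) - U^j\bar e_j(V^i)]\bar e_i - U^i V^j\,\pounds_{\bar e_i}\bar e_j.$

The last step required relabelling of indices. Use of (3.7) in the last term
produces the desired result.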

Follows from (2.7) with $V^i = \delta^i{}_1$.

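From (3.13) we have

$$(\pounds_{\bar V}\tilde\omega)_j W^j = V^i\frac{\partial}{\partial x^i}(\omega_j W^j) - \omega_j\left(V^i\frac{\partial W^j}{\partial x^i} - W^i\frac{\partial V^j}{\partial x^i}\right) = \left[V^i\frac{\partial\omega_j}{\partial x^i} + \omega_i\frac{\partial V^i}{\partial x^j}\right]W^j,$$

where indices have been relabelled in the second term. The arbitrariness
of $W^j$ gives the result.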


(a) _ _ . _ 
ΣαωωιΣ βω4(ω | = 2, [αω δω, βω)Α)] 
a a, 





= Σ, {αωβω[άω. Ae] + ew [4@Bey)] Ae 
a,b 
— By [A@@@)] 4@} - 


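$$(\pounds_{\bar V}T)^{i\ldots j}{}_{k\ldots l} = V^m\frac{\partial}{\partial x^m}T^{i\ldots j}{}_{k\ldots l} - T^{m\ldots j}{}_{k\ldots l}\frac{\partial V^i}{\partial x^m} - \ldots - T^{i\ldots m}{}_{k\ldots l}\frac{\partial V^j}{\partial x^m} + T^{i\ldots j}{}_{m\ldots l}\frac{\partial V^m}{\partial x^k} + \ldots + T^{i\ldots j}{}_{k\ldots m}\frac{\partial V^m}{\partial x^l}.$$

Set $V^i = \delta^i{}_1$ and get $(\pounds_{\bar V}T)^{i\ldots j}{}_{k\ldots l} = \partial T^{i\ldots j}{}_{k\ldots l}/\partial x^1$.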

[L*, ἐπ] = ἐπ £7] + [τσ ἐπ] £5, + &, 1 4,. £7] 

+ [£i, £7 | £7, = fF £7, + £7, £5, — fF, fr — £7 £7, = 0. 

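The second step used equation (3.8). To prove (3.33), derive the
following relations: $\bar l_x = -\sin\phi\,\partial/\partial\theta - \cos\phi\cot\theta\,\partial/\partial\phi$;
$\bar l_y = -\cos\phi\,\partial/\partial\theta + \sin\phi\cot\theta\,\partial/\partial\phi$; $\bar l_z = \partial/\partial\phi$.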

Follows trivially from $\pounds_{a\bar V} = a\,\pounds_{\bar V}$ if $a$ is constant. (This is not true if $a$ is
a general function.)

When Lie dragged along $\phi$ from $\phi = 0$, $\bar e_z$ is unchanged but $\bar e_x$ becomes
$\bar e_r = \cos\phi\,\bar e_x + \sin\phi\,\bar e_y$. The third basis vector, $\bar e_\phi$, has Cartesian
representation $-\sin\phi\,\bar e_x + \cos\phi\,\bar e_y$. The three vector harmonics are
$\exp(2i\phi)\,\bar e_z$, $\exp(2i\phi)(\cos\phi\,\bar e_x + \sin\phi\,\bar e_y)$, $\exp(2i\phi)(-\sin\phi\,\bar e_x + \cos\phi\,\bar e_y)$.
It might be more useful to use the more compact linear
combinations $\exp(2i\phi)\,\bar e_z$, $\exp(i\phi)(\bar e_x + i\bar e_y)$, $\exp(3i\phi)(\bar e_x - i\bar e_y)$. It
is obvious from these that $\bar e_x + i\bar e_y$ has eigenvalue $+1$ and $\bar e_x - i\bar e_y$
eigenvalue $-1$, but these are easy to verify directly. The one-form $\tilde dz$
is unchanged by Lie dragging, and $\tilde dx$ becomes $\tilde dr = \cos\phi\,\tilde dx + \sin\phi\,\tilde dy$.
The third basis one-form, $\tilde d\phi$, has Cartesian representation
$-\sin\phi\,\tilde dx + \cos\phi\,\tilde dy$. The three one-form harmonics are, then, $\exp(2i\phi)\,\tilde dz$,
$\exp(i\phi)(\tilde dx + i\,\tilde dy)$, $\exp(3i\phi)(\tilde dx - i\,\tilde dy)$. Since $\pounds_{\bar e_\phi} f = 2if$,
$\tilde d(\pounds_{\bar e_\phi} f) = 2i\,\tilde df$. But it is easy to show (using cylindrical coordinates) that
$\tilde d(\pounds_{\bar e_\phi} f) = \pounds_{\bar e_\phi}(\tilde df)$, which completes the proof. This is a special case of
a general theorem proved in §4.21 below.

A right-invariant vector field is invariant under the map $R_g$, generated
by right translations analogously to $L_g$. Figure 3.10 applies here, too,
so they form a Lie algebra. The integral curves through $e$ of a right-
invariant vector field are one-parameter subgroups for the same reason
as for left-invariant integral curves. But the subgroups are in 1-1
correspondence with the two sets of integral curves, so the curves are the
same. The integral curves of a left-invariant field not passing through $e$
are obtained by left-translation of those which do, i.e. of the one-
parameter subgroups. The curve of $\bar V$ through, say, $h$ is $h\,g_{\bar V_e}(t)$. The
right-invariant curve through $h$ is $g_{\bar V_e}(t)\,h$. This is not the same unless
$h$ and $g_{\bar V_e}(t)$ commute.

(a) Because left-translation by $h^{-1}$ is a 1-1 map of neighborhoods of $h$
onto neighborhoods of $e$, the vector-field map $L_h$ is also 1-1 and
invertible. It follows that if $\{\bar V_i(e)\}$ is linearly independent then so is
$\{L_h\bar V_i(e) = \bar V_i(h)\}$ for any $h$.

(b) The point is that the fields $\{\bar V_i\}$ are globally a basis, so any vector
field is defined by giving $\{a_i(g)\}$ for all $g$. This maps $TG$ onto $G \times R^n$.

(b) The key step is that $(B^{-1}AB)^n = B^{-1}AB\,B^{-1}AB \cdots B^{-1}AB = B^{-1}A^nB$.

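(c) Block-diagonal matrices are easy to exponentiate, since

$$\begin{pmatrix} P_1 & 0 & \cdots \\ 0 & P_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}^{n} = \begin{pmatrix} (P_1)^n & 0 & \cdots \\ 0 & (P_2)^n & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}.$$

In case (i), $\exp(t\lambda_j)$ is the usual exponential function. In (ii),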


κο L(t πείς 0 1 (1 ] 
| i, a 0 κα τι 
po 8 | |; 1) 
0 r; — 1s; —] 
al) | expen) (O° ο | 
V2\i -ἶ 0 cos tr; — isin ts; 


πα - 
al, .. 


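from which the answer follows. For (iii), some experimentation will
verify that

$$\begin{pmatrix} x & 1 & 0 & 0 & \cdots \\ 0 & x & 1 & 0 & \cdots \\ 0 & 0 & x & 1 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}^{n} = \begin{pmatrix} x^n & \dfrac{d}{dx}x^n & \dfrac{1}{2!}\dfrac{d^2}{dx^2}x^n & \dfrac{1}{3!}\dfrac{d^3}{dx^3}x^n & \cdots \\ 0 & x^n & \dfrac{d}{dx}x^n & \dfrac{1}{2!}\dfrac{d^2}{dx^2}x^n & \cdots \\ 0 & 0 & x^n & \dfrac{d}{dx}x^n & \cdots \\ \vdots & & & & \ddots \end{pmatrix}.$$

When multiplied by $t^n$ and put into the exponentiation sum, this gives
(3.59c).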
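Both matrix facts used in this solution can be tried out on small integer examples. The sketch below is illustrative only: the matrices `A` and `B`, the value `x = 3`, the power `n = 5` and the 4 x 4 block size are arbitrary assumptions. It relies on the fact that $\frac{1}{k!}\frac{d^k}{dx^k}x^n$ equals the binomial coefficient times $x^{n-k}$.

```python
from math import comb

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(M, n):
    size = len(M)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

n = 5

# Part (b): (B^{-1} A B)^n compared with B^{-1} A^n B
A = [[2, 1], [0, 3]]
B = [[1, 2], [1, 3]]          # det B = 1, so the inverse is integer-valued
B_inv = [[3, -2], [-1, 1]]
lhs = matpow(matmul(matmul(B_inv, A), B), n)
rhs = matmul(matmul(B_inv, matpow(A, n)), B)

# Part (c)(iii): n-th power of a block with x on the diagonal, 1 above it
x, size = 3, 4
J = [[x if i == j else int(j == i + 1) for j in range(size)]
     for i in range(size)]
J_n = matpow(J, n)
# (1/k!) d^k/dx^k x^n = C(n, k) x^(n-k), placed on the k-th superdiagonal
expected = [[comb(n, j - i) * x ** (n - (j - i)) if j >= i else 0
             for j in range(size)] for i in range(size)]
```

Integer arithmetic keeps both comparisons exact, so `lhs == rhs` and `J_n == expected` can be tested directly.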
3.13 The sequence of matrices 
cost sin (t/ . 
—sint cost 


is a continuous path containing e (for t = 0) and 


vs 
ο -ιί 

(for $t = \pi$). The matrix is not in a one-parameter subgroup because it is
not the exponential of any matrix. This follows from exercise 3.12. It
is easy to verify that none of the forms (3.59) can be transformed as in
(3.56) to give the desired matrix, because of the negative elements on
the main diagonal.

3.14 (a) An eigenvalue $\lambda$ of $A$ satisfies the equation $\det(A - \lambda I) = 0$
$= \det[(A - \lambda I)^T] = \det(A^T - \lambda I) = \det(A^{-1} - \lambda I)$ and so is an
eigenvalue of $A^{-1}$. The converse also holds.

(b) $\det(A - \lambda I) = 0 \Rightarrow 0 = \det(A^{-1})\det(A - \lambda I) = \det(I - \lambda A^{-1})$
$= \det(-\lambda I)\det(A^{-1} - \lambda^{-1}I)$. None of the eigenvalues is zero since
$\det(A) \neq 0$, so we conclude $\det(A^{-1} - \lambda^{-1}I) = 0$. Thus, if $A$ is in
$O(n)$ and $\lambda$ is an eigenvalue of $A$, so is $1/\lambda$. But the equation
$\det(A - \lambda I) = 0$ is real, so its solutions come in complex-conjugate pairs.
In order for these pairs to be inverse they must have the form
$(e^{i\theta}, e^{-i\theta})$.

(c) These forms are just (3.58a, b) for the given eigenvalues, with
(3.62b) being a special case of (3.62c). The only case to exclude is
(3.58c) when $\mu_j = \pm 1$. This form is impossible because $B^{-1}AB$ is in
$O(n)$ while (3.58c) is easily seen not to be.

(d) The Lie algebra can be found by looking at the tangent space of any
element, in particular of $e$, and so we can restrict attention to the
generators of the one-dimensional subgroups of $SO(n)$. The problem
may be solved by examining canonical forms, but the following
method is quicker. Consider the element $\exp(tA)$, where $A$ is in the
Lie algebra of $O(n)$. Then $[\exp(tA)]^{-1} = \exp(-tA)$ and $[\exp(tA)]^T
= \exp(tA^T)$. These are equal for any $t$, so $A^T = -A$. The converse
is proved in the same way. The dimension of $O(n)$ is the maximum
number of linearly independent antisymmetric $n \times n$ matrices,
$\tfrac{1}{2}n(n - 1)$.

3.15 A matrix $A$ is in $SO(n)$ if and only if its canonical form (3.62) has an
even number of blocks $(-1)$. An element of $O(n)$ not in $SO(n)$ has an
odd number of blocks $(-1)$, and may obviously be obtained from one
in $SO(n)$ by the given transformation.

3.16 As in the previous problem, the canonical form of $A$ in $SO(n)$ has an
even number of blocks $(-1)$, which may be ordered to be a special case
of (3.62c) with $\theta = \pi$. Any canonical form (3.62a) or (3.62c) is a
special case of the exponentials (3.59a) or (3.59b). For $SO(3)$, the
canonical form must be one block (3.62a) and another (3.62c). The
eigenvector for (3.62a) is the axis of rotation.

3.17 Use equation (3.60).


3.18 The matrix $\mathrm{diag}[\exp(ia_1t), \exp(ia_2t), \ldots]$ is the exponential of
$\mathrm{diag}(ia_1t, ia_2t, \ldots)$. The first matrix has determinant $\exp(it\sum_j a_j)$, and if
this equals 1 the second matrix is traceless. Let us establish a
correspondence between complex numbers and real $2 \times 2$ matrices, defined
by $a + ib \leftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$. Then multiplication preserves this:
$(a + ib)(c + id) \leftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}\begin{pmatrix} c & d \\ -d & c \end{pmatrix}$.
There is thus a group isomorphism between the complex numbers and matrices of this
special form. (It is in fact an algebra isomorphism, since it is preserved by addition
as well.) This generalizes to a group isomorphism between $GL(n, C)$ and the subgroup
of $GL(2n, R)$ consisting of matrices built of $2 \times 2$ blocks of the form
$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.
Hermitian conjugation in $GL(n, C)$ is simply the transpose operation in
$GL(2n, R)$. Thus we may regard $U(n)$ as a subgroup of $O(2n)$. Since
$O(2n)$ is generated by antisymmetric $2n \times 2n$ matrices, $U(n)$ is generated
by anti-Hermitian matrices. These must be trace-free by our first
observation and by exercise 3.20(a).
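The correspondence between complex numbers and $2 \times 2$ real matrices is easy to try out. In this minimal sketch the helper name `to_matrix` and the two sample numbers are hypothetical choices made for illustration.

```python
# Illustrative check that a + ib <-> [[a, b], [-b, a]] respects
# multiplication, and that complex conjugation becomes transposition.

def to_matrix(z):
    a, b = z.real, z.imag
    return [[a, b], [-b, a]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

z = complex(2, 3)      # arbitrary sample values
w = complex(-1, 4)

product_of_matrices = matmul(to_matrix(z), to_matrix(w))
matrix_of_product = to_matrix(z * w)

conjugate_matrix = to_matrix(z.conjugate())
transposed_matrix = transpose(to_matrix(z))
```

Small integer-valued inputs are used so that the floating-point comparison is exact; for general inputs a tolerance would be the safer choice.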
3.20 (a) $(B^{-1}AB)^i{}_j = (B^{-1})^i{}_k A^k{}_l B^l{}_j$. But $(B^{-1})^i{}_k B^l{}_i = \delta^l{}_k$, so
$\mathrm{tr}(B^{-1}AB) = \mathrm{tr}(A)$.
(b) Since $\det(B^{-1}) = 1/\det(B)$, we have $\det(\exp(A)) = \det(B^{-1}\exp(A)B)$
$= \det(\exp(B^{-1}AB))$; moreover, $\exp(\mathrm{tr}(A)) = \exp(B^{-1}(\mathrm{tr}\,A)B)$
$= \exp(\mathrm{tr}(B^{-1}AB))$. Thus, we need only prove (3.67) for the various
canonical forms. The form (3.58a) is trivial. For (3.58b), inspection
of (3.59b) proves the result. The same is true of (3.58c), for the
matrix written in (3.59c) has unit determinant.
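The identity relating the determinant of an exponential to the exponential of the trace can also be checked numerically. The following is a rough sketch, assuming a truncated Taylor series is accurate enough for a small $2 \times 2$ matrix; the matrix entries and the number of series terms are arbitrary choices.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    # Truncated Taylor series: sum over k of A^k / k!
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

A = [[0.3, 1.2], [-0.7, 0.5]]     # arbitrary sample matrix
E = expm(A)
det_exp = E[0][0] * E[1][1] - E[0][1] * E[1][0]
exp_trace = math.exp(A[0][0] + A[1][1])
```

The two numbers `det_exp` and `exp_trace` should agree to within rounding error.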
3.21 (a) Use the identity $(\bar a \times \bar b) \times \bar c = (\bar a \cdot \bar c)\,\bar b - (\bar b \cdot \bar c)\,\bar a$.
3.22 (a) Note that $\det\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} = |a|^2 + |b|^2$ only vanishes for the zero matrix.
(b) The dimension is 4 because there are 4 real numbers freely chosen
to define an element of $H$.
(d) Equation (3.73) is the equation for $S^3$, so the 1-1 mapping is
established by associating the point $(a_1, a_2, a_3, a_4)$ of $S^3$ in $R^4$ with the
matrix $A$.
3.23 Since $[g(s)]^{-1} = \exp(-sY)$ we can write (3.79) as

$$\exp(sY)\exp(tX)\exp(-sY) = \exp[t\,\mathrm{Ad}_{g(s)}(X)].$$

Differentiating both sides with respect to $t$ at $t = 0$ gives

$$\exp(sY)\,X\exp(-sY) = \mathrm{Ad}_{g(s)}(X).$$

Expanding the left-hand side in powers of $s$ gives

$$X + s[Y, X] + \tfrac{1}{2}s^2[Y, [Y, X]] + \tfrac{1}{3!}s^3[Y, [Y, [Y, X]]] + \ldots,$$

proving the result.
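The nested-commutator expansion can be verified exactly in a special case. The sketch below assumes a strictly upper-triangular $3 \times 3$ matrix `Y`, for which $Y^3 = 0$, so that $\exp(sY)$ is a finite polynomial and the series terminates. The matrices `X`, `Y` and the value of `s` are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import factorial

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def commutator(X, Y):
    return add(matmul(X, Y), scale(-1, matmul(Y, X)))

def exp_nilpotent(M, s):
    # Exact for a 3x3 matrix with M^3 = 0: exp(sM) = I + sM + (sM)^2 / 2
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    sM = scale(s, M)
    return add(add(identity, sM), scale(Fraction(1, 2), matmul(sM, sM)))

Y = [[Fraction(v) for v in row] for row in [[0, 1, 2], [0, 0, 3], [0, 0, 0]]]
X = [[Fraction(v) for v in row] for row in [[1, 2, 0], [3, -1, 4], [5, 0, 2]]]
s = Fraction(1, 2)

# Left-hand side: exp(sY) X exp(-sY)
lhs = matmul(matmul(exp_nilpotent(Y, s), X), exp_nilpotent(Y, -s))

# Right-hand side: X + s[Y,X] + (s^2/2!)[Y,[Y,X]] + ...
rhs = [[Fraction(0)] * 3 for _ in range(3)]
nested = X
for k in range(7):
    rhs = add(rhs, scale(s ** k / factorial(k), nested))
    nested = commutator(Y, nested)
```

Since $Y^3 = 0$, every nested commutator beyond the fourth vanishes, so seven terms are more than enough and the comparison `lhs == rhs` is exact in rational arithmetic.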

3.24 (b) Ικ(Υ1 1) =iY1 0 V2: ly(¥, 1) = Yio /V2;1.%1-1 )=i¥i-4; 

L(Y J=i1%14 -N i WMV231,(Y10) (νι + Y; 1 V2; 


l.(Y10) = ο: (Σι 1) Ξ —iYy0/V2;1,(1%1 1) Ξ Y10/V2312(¥ 1) 
= iY, 1° 
3.26
Any complex linear combination of the three functions will thus be
transformed into another one by differentiation along $\bar l_x$, $\bar l_y$, or $\bar l_z$.


The first of (3.30) implies /,(f) = 0. The remaining two then imply 
1,.(f) = 1, (f) = 0. Thus, f must be constant on the sphere. 
; 3 1/2 ot 0 
MJ ke > 0 /2 3 
81 
i 0 
1 0 1 
1/2 
ΛΑ, = 5 | O πι; 
0 2 ο 
0 1 0 
Xi = ---- 1 0-1): x, = Ly. 
V2 ) 
ο -ι 0 
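$B(\bar U + \bar W, \bar U + \bar W) = 0 = B(\bar U, \bar U) + B(\bar U, \bar W) + B(\bar W, \bar U) + B(\bar W, \bar W)$
$= B(\bar U, \bar W) + B(\bar W, \bar U)$.
(a) $\tilde p(\ldots, \bar U, \ldots, \bar W, \ldots) = p_{\ldots i\ldots j\ldots}U^iW^j = -p_{\ldots j\ldots i\ldots}U^iW^j$
$= -\tilde p(\ldots, \bar W, \ldots, \bar U, \ldots)$.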
(b) Follows from Ajj, = — Αιξ Ap, εἰς 
(c) $A_{ij}B^{ij} = \tfrac{1}{2}A_{ij}B^{ij} + \tfrac{1}{2}A_{ji}B^{ji} = \tfrac{1}{2}(A_{ij}B^{ij} - A_{ij}B^{ji}) = A_{ij}B^{[ij]}$. First
step merely relabels dummy indices.

(d) BY) =} [BG &’) — Β(ῶ', &)] =0. 

The components of a $\binom{0}{p}$ tensor are its values on sets of $p$ basis vectors. If $p > n$
there must be at least two vectors which are the same. Exchanging these
makes no change at all, but at the same time must change the sign of
the component. The only number which equals its negative is zero.

One need only demonstrate that the sum of two $p$-forms is a $p$-form,
i.e. totally antisymmetric, and that a $p$-form times a number is likewise
a $p$-form. The dimension of this space is the number of independent
components a $p$-form can have, $C^n_p$ from equation (4.7).

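$\tilde p \wedge \tilde q(\bar U, \bar V) = \tilde p(\bar U)\tilde q(\bar V) - \tilde q(\bar U)\tilde p(\bar V) = -\tilde p \wedge \tilde q(\bar V, \bar U)$. Obviously
$\tilde p \wedge \tilde p(\bar U, \bar V) = 0$ for any $\bar U$, $\bar V$.

Check equation (4.9): $\tilde\alpha(\bar U, \bar V) = \tfrac{1}{2}\alpha_{ij}[\tilde\omega^i(\bar U)\tilde\omega^j(\bar V) - \tilde\omega^j(\bar U)\tilde\omega^i(\bar V)]$
$= \tfrac{1}{2}\alpha_{ij}(U^iV^j - V^iU^j) = \tfrac{1}{2}\alpha_{ij}U^iV^j + \tfrac{1}{2}\alpha_{ji}V^iU^j = \alpha_{ij}U^iV^j$. The number
of independent two-forms $\tilde\omega^i \wedge \tilde\omega^j$ is $\tfrac{1}{2}n(n - 1)$, which is the
dimension of the space of two-forms.

$(1 + 1)^n = \sum_{p=0}^{n} C^n_p$ by the binomial theorem. Notice that this includes
$p = 0$, the one-dimensional space of zero-forms.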
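The counting in the last few solutions can be made concrete by listing the independent components of a $p$-form as strictly increasing index sets. The sketch below is illustrative; the function name `independent_components` and the choice $n = 4$ are assumptions made for the example.

```python
from itertools import combinations
from math import comb

def independent_components(n, p):
    # Each independent component of a p-form is labelled by a strictly
    # increasing choice of p indices out of 1..n.
    return list(combinations(range(1, n + 1), p))

n = 4
dimensions = [len(independent_components(n, p)) for p in range(n + 1)]
total = sum(dimensions)
two_form_labels = independent_components(n, 2)
too_many_indices = independent_components(n, n + 1)
```

For $n = 4$ the list `dimensions` reproduces the binomial coefficients, their sum is $2^4$, and no index set exists for $p > n$, in line with the argument that such forms vanish.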


4.8 


4.9 


4.10 


4.11 
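By equation (4.9) and its generalization we know that $\tilde q = (1/2!)\,q_{jk}\,\tilde\omega^j \wedge \tilde\omega^k$
and $\tilde p \wedge \tilde q = (1/3!)\,(\tilde p \wedge \tilde q)_{ijk}\,\tilde\omega^i \wedge \tilde\omega^j \wedge \tilde\omega^k$. But
$\tilde p \wedge \tilde q = (1/2!)\,p_i q_{jk}\,\tilde\omega^i \wedge (\tilde\omega^j \wedge \tilde\omega^k) = (1/2!)\,p_i q_{jk}\,\tilde\omega^i \wedge \tilde\omega^j \wedge \tilde\omega^k$
$= (1/2!)\,p_{[i}q_{jk]}\,\tilde\omega^i \wedge \tilde\omega^j \wedge \tilde\omega^k$. Since $\{\tilde\omega^i \wedge \tilde\omega^j \wedge \tilde\omega^k\}$ form a basis, we conclude
$(\tilde p \wedge \tilde q)_{ijk} = (3!/2!)\,p_{[i}q_{jk]} = \tfrac{1}{2}(p_iq_{jk} + p_jq_{ki} + p_kq_{ij} - p_iq_{kj} - p_kq_{ji} - p_jq_{ik})$
$= p_iq_{jk} + p_jq_{ki} + p_kq_{ij}$. The generalization to equation
(4.11) is straightforward. The student is reminded that $C^{p+q}_p = C^{p+q}_q$,
so that (4.11) treats $\tilde p$ and $\tilde q$ even-handedly.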
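The component formula for the wedge product of a one-form with a two-form can be checked by brute force. This is a sketch under stated assumptions: a four-dimensional space, arbitrary integer components `p`, and an antisymmetric array `q` built from arbitrary values. The helper names are hypothetical.

```python
from itertools import permutations

def sign(perm):
    # Parity of a permutation via its inversion count
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

n = 4
p = [1, -2, 3, 5]                       # assumed one-form components
upper = {(0, 1): 2, (0, 2): -1, (0, 3): 4,
         (1, 2): 7, (1, 3): -3, (2, 3): 6}
q = [[0] * n for _ in range(n)]         # antisymmetric two-form components
for (j, k), value in upper.items():
    q[j][k], q[k][j] = value, -value

def wedge(i, j, k):
    # The three-term cyclic formula for (p ^ q)_{ijk}
    return p[i] * q[j][k] + p[j] * q[k][i] + p[k] * q[i][j]

def six_term_sum(i, j, k):
    # 3! * p_[i q_jk]: signed sum over all permutations of (i, j, k)
    idx = (i, j, k)
    return sum(sign(s) * p[idx[s[0]]] * q[idx[s[1]]][idx[s[2]]]
               for s in permutations(range(3)))

triples = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)]
# (3!/2!) p_[i q_jk] = six_term_sum / 2, so compare 2 * wedge with the sum
matches = all(2 * wedge(*t) == six_term_sum(*t) for t in triples)
antisymmetric = all(wedge(i, j, k) == -wedge(j, i, k) == -wedge(i, k, j)
                    for i, j, k in triples)
```

Both flags should be true: the three-term cyclic expression agrees with the fully antisymmetrized one, and the result changes sign under exchange of any adjacent pair of indices.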

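Equation (4.16) is bilinear in $\tilde\beta$ and $\tilde\alpha$, so it suffices to prove it for the
case $\tilde\beta = \tilde\omega^1 \wedge \ldots \wedge \tilde\omega^p$ and $\tilde\alpha = \tilde\omega^{p+1} \wedge \ldots \wedge \tilde\omega^{p+q}$. Then
$(\tilde\beta \wedge \tilde\alpha)(\bar\xi) = (\tilde\omega^1 \wedge \ldots \wedge \tilde\omega^p \wedge \tilde\omega^{p+1} \wedge \ldots \wedge \tilde\omega^{p+q})(\bar\xi)$. This wedge product has
$(p + q)!$ terms, all possible permutations of the indices. The ones
which have $\tilde\omega^1$ first contract with $\bar\xi$ to give
$\tilde\omega^1(\bar\xi)\,(\tilde\omega^2 \wedge \ldots \wedge \tilde\omega^p \wedge \tilde\omega^{p+1} \wedge \ldots \wedge \tilde\omega^{p+q})$. Each term with $\tilde\omega^2$ first is an odd permutation
of one with $\tilde\omega^1$ first, obtained by exchanging $\tilde\omega^1$ with $\tilde\omega^2$. Therefore
these terms contract with $\bar\xi$ to give
$-\tilde\omega^2(\bar\xi)\,(\tilde\omega^1 \wedge \tilde\omega^3 \wedge \ldots \wedge \tilde\omega^p \wedge \tilde\omega^{p+1} \wedge \ldots \wedge \tilde\omega^{p+q})$. Similarly there is a contraction
$\tilde\omega^3(\bar\xi)\,(\tilde\omega^1 \wedge \tilde\omega^2 \wedge \tilde\omega^4 \wedge \ldots)$ and so on. Now the first $p$ such contractions are just
$\tilde\beta(\bar\xi) \wedge \tilde\alpha$, since they involve only the one-forms in $\tilde\beta$.

The remaining $q$ contractions are the contractions of $\bar\xi$ with the one-
forms in $\tilde\alpha$, with $\tilde\beta$ wedged in front, except that their overall sign is
governed by the degree of $\tilde\beta$. That is, the first such term is
$(-1)^p\,\tilde\omega^{p+1}(\bar\xi)\,(\tilde\omega^1 \wedge \ldots \wedge \tilde\omega^p \wedge \tilde\omega^{p+2} \wedge \ldots \wedge \tilde\omega^{p+q})$, and all other terms are also
$(-1)^p$ times the terms that appear in $\tilde\beta \wedge [\tilde\alpha(\bar\xi)]$.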

In Cartesian coordinates, $\bar W = \bar U \times \bar V$ has components $(U^2V^3 - U^3V^2,$
$U^3V^1 - U^1V^3,\ U^1V^2 - U^2V^1)$. Equation (4.20), with $\omega_{123} = 1$,
becomes *(U x V),, = (U x V)°, etc., which proves equation (4.22). 
(a) The right-hand side of the equation after (4.35) is simply a sum of 
(p + 1)! terms. The terms in which the lower index { is first number 
p!, and are the first in the next line. There are also p! terms in which 
the lower /m is first, and these are given (with the correct sign) by the 
second term. The final line before (4.36) performs the sum on i (δὲ 
=n). 

(b) Applying (4.36) (n — p) times to reduce the n-delta to a p-delta 
gives a factor 1°2...(n—p), as in (4.37). 


(a) The keypoint to observe is that ευ. κ ANA™...A™ = Al 
χει κά3. we A”) + Al (e,; 2A”. . An) +... =A" (Eq. 
x A?%,. A"P)— AM (ey gA2™... A") + —..., where in the final 


line Greek indices assume only (n — 1) values in each sum, missing out 
1 in the first sum, 2 in the second, etc. The pth term in ‘parenthesis is, 
by (4.39), the determinant of the $(n - 1) \times (n - 1)$ matrix obtained
by excluding the first row and the pth column of the original matrix. 
(b) Equation (4.39) is obviously antisymmetric if 2 is exchanged with 
1, and this leads to the desired result. 
Let the matrix ΛΙ, transform basis one-forms, @! = oe Then 
ῶ- λα Ad ο ο 
dx” -- ἀθί(Λ) ἀχλλ...λ μη. But the metric’s components trans- 
form by g;';" = ΔΑ, “AL j’&n1, Which as a matrix equation has the deter- 
minant det (g;';") = [det(A)]? det (g,;). (Note that the determinant of 
the transpose of a matrix equals that of the original matrix, by exercise 
4.12(b).) Moreover, the original basis was orthonormal, so det(g;;) 
= +1. It follows that det(A) = |det(g;';")I""”, proving the result. 
(a) Special case of property (2) with & the zero-form f, using property 
(3) to eliminate ddg. 
(b) Obvious. 
VE = Ny (MV) = My AUVs + My AP, iV". The second term 
is not part of the tensor transformation law. Now, in [U, V]’ the two 
wrong terms are U! VAP, ;— Viu'A’ , ;. This vanishes because 

= = Ai; 1 = 07x and ag 
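Curl(grad $f$) $= {}^*\tilde d(\tilde df) = {}^*(\tilde d\tilde df) = 0$.
Div(curl $\bar a$) $= \tilde d\,{}^*({}^*\tilde d\tilde a) = \tilde d({}^{**}\tilde d\tilde a) = \tilde d\tilde d\tilde a = 0$.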
The first expression in (4.64) is the component version of ἀᾶ = 0. The 
next step is identical to that in exercise 4.11(a). The second term on 
the right-hand side of (4.64) is obtained from the first by simply ex- 
changing yi and 7; this an in a sign change and shows they are equal. 
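Div($\bar a$) $= \tilde d\,{}^*\bar a = 0 \Rightarrow {}^*\bar a = \tilde d\tilde b$ for some $\tilde b \Rightarrow \bar a = {}^*\tilde d\tilde b = $ curl($\bar b$).
Curl($\bar a$) $= {}^*\tilde d\tilde a = 0 \Rightarrow \tilde d\tilde a = 0 \Rightarrow \tilde a = \tilde df \Rightarrow \bar a = $ grad($f$).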
Choose a vector basis (61. @,,..., δι) for which (@2,...,é,,) are 
tangent to 0U: f#(é,) = 0 for 2 <p <n. This means that 7(@,) # 0, 
since # must have at least one nonzero component. Since @ is an n- 
form, it has only one independent component, which is O(é,, 
@n,...,0n) = (HAN Q) (1, @n,..., &,). The only nonzero term in this 
occurs when fi contracts with @,,so we have #(€,)@(@,,..., δι). But 
@(é,,..., &,) is the one independent component of ἄ]ου in this basis, 
which is therefore found to be ὤ (δι... ., @, )/A(é, ). But & itself is not 
unique: @ + fv for any function f works just as well. Now, the only 
requirement of # is that it be normal to dU. On our basis this means n 
has components (n', 0,..., 0). Clearly any two normals ff and 77’ are 
related by 7 = ffi’. By our previous result, this changes Gg, to f* 
ἄιου. 5ο that 7(£) @57, is unchanged. 
Following the steps from (4.76) we find $\tilde d[\tilde\omega(\bar\xi)] = (f\xi^i)_{,i}\,\tilde dx^1 \wedge \ldots \wedge \tilde dx^n$
$= f^{-1}(f\xi^i)_{,i}\,\tilde\omega$.

Use exercise 4.13, showing that the metric in spherical polars is
diag$(1, r^2, r^2\sin^2\theta)$. Then div $\bar\xi$ follows from setting $f = r^2\sin\theta$ in
(4.80).
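The recipe for the divergence in spherical polars can be tried numerically. This is a rough sketch, assuming the formula takes the form $f^{-1}\,\partial_i(f\xi^i)$ with $f = r^2\sin\theta$ and using central finite differences. The test field, the evaluation point and the step size are arbitrary choices.

```python
import math

def f(r, theta):
    # Assumed volume-form coefficient in spherical polar coordinates
    return r * r * math.sin(theta)

def divergence(xi, r, theta, phi, h=1e-5):
    # Central-difference estimate of (1/f) d_i (f xi^i)
    def flux(i, point):
        return f(point[0], point[1]) * xi(*point)[i]
    base = [r, theta, phi]
    total = 0.0
    for i in range(3):
        plus, minus = list(base), list(base)
        plus[i] += h
        minus[i] -= h
        total += (flux(i, plus) - flux(i, minus)) / (2 * h)
    return total / f(r, theta)

def radial_field(r, theta, phi):
    # Coordinate components (xi^r, xi^theta, xi^phi) of the position vector
    return (r, 0.0, 0.0)

div_radial = divergence(radial_field, 2.0, 1.0, 0.5)
```

For the position-vector field the familiar Cartesian answer is 3, and the spherical-coordinate formula should reproduce it up to discretization error.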

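From (4.67), $\pounds_{\bar V}(\rho\tilde\omega) = \tilde d[\rho\tilde\omega(\bar V)] = \tilde d[\tilde\omega(\rho\bar V)] = \mathrm{div}(\rho\bar V)\,\tilde\omega$.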

(a) Since $\tilde\omega(\bar\xi) = {}^*\bar\xi$, the dual of (4.77) gives (4.81) immediately.

(b) ${}^*F$ is an $(n - p)$-form, $\tilde d\,{}^*F$ is an $(n - p + 1)$-form, so ${}^*\tilde d\,{}^*F$ is a
$(p - 1)$-vector. Equation (4.83) is proved by a simple generalization
of (4.76).

(ο) (div ¢ Fed — fo (fF*---) he 

(a) If 6 = da then Γῶ-τ G GZ; but the second integral vanishes since 
there is no boundary. . 

(b) $\tilde d\tilde\omega = \tilde dx^1 \wedge \tilde dx^2 \wedge \tilde dx^3$ is the usual volume-form, so if $B$ is the unit
ball (the interior of $S^2$ in $R^3$) then $\int_B \tilde d\tilde\omega = $ volume of ball $= 4\pi/3$. By
Stokes' theorem, $\int_B \tilde d\tilde\omega = \int_{S^2} \tilde\omega|_{S^2}$. Now, any two-form on a two-
dimensional manifold is closed, since all three-forms vanish identically.
Thus, $\tilde\omega|_{S^2}$ is closed, but violates (a) above, so it is not exact.

(c) We are given $\tilde\beta$ defined everywhere on $S^2$ with $\tilde d\tilde\beta = 0$. Integrate $\tilde d\tilde\beta$
over any region of $S^2$ bounded by a single closed curve $\mathscr{C}$ to find
$\oint_{\mathscr{C}} \tilde\beta = 0$ for any $\mathscr{C}$. This can only be true if $\tilde\beta = \tilde df$ for some $f$:
otherwise some curve $\mathscr{C}$ could be found on which $\tilde\beta$ would have a non-
vanishing integral. In fact $f$ can be constructed by choosing an arbitrary
value $f_0$ at a point $P$ and integrating $\tilde\beta$ on any curve from $P$ to any point
$Q$, defining $f(Q) = f_0 + \int_P^Q \tilde\beta$. The condition $\oint \tilde\beta = 0$ guarantees that
$f(Q)$ is independent of the path from $P$ to $Q$.

(a) (i) and (ii) are trivial. For (iii) suppose $\tilde\alpha - \tilde\beta = \tilde d\tilde\mu_1$, $\tilde\beta - \tilde\gamma = \tilde d\tilde\mu_2$.
Then $\tilde\alpha - \tilde\gamma = \tilde d(\tilde\mu_1 + \tilde\mu_2)$.

(b) First prove that if $\tilde\beta_1 \sim \tilde\alpha_1$ and $\tilde\beta_2 \sim \tilde\alpha_2$ then $a\tilde\beta_1 + b\tilde\beta_2 \sim a\tilde\alpha_1 + b\tilde\alpha_2$
for any real numbers $a$, $b$. This is trivial since there exist $\tilde\mu_1$ and
$\tilde\mu_2$ such that $\tilde\beta_1 = \tilde\alpha_1 + \tilde d\tilde\mu_1$ and $\tilde\beta_2 = \tilde\alpha_2 + \tilde d\tilde\mu_2 \Rightarrow a\tilde\beta_1 + b\tilde\beta_2 = a\tilde\alpha_1
+ b\tilde\alpha_2 + \tilde d(a\tilde\mu_1 + b\tilde\mu_2)$. Thus we can consistently define the linear
combination $aA_1 + bA_2$ of equivalence classes $A_1$ and $A_2$ to be the
equivalence class of the same linear combination of any of their
elements. We can thus regard the equivalence classes themselves as vectors
in a vector space: the identity is the equivalence class of the zero
vector of $Z^p$, the inverse of any class $A$ is $-A$, etc.

(c) Take the vector $(0, b)$ in $R^2$. What is its equivalence class? It is all
vectors of the form $(0, b) + (a, 0)$ for arbitrary $a$ and fixed $b$. The locus
of points is a straight line parallel to the $x$-axis a distance $b$ from it. In
this fashion we can identify the space of equivalence classes with the
congruence of such lines.

The vector field cannot vanish anywhere because the mapping leaves
no point fixed: a fixed point would correspond to a block $(+1)$ in the
canonical form of $T$, but $T$ is already in canonical form with no such
blocks. Notice that it is crucial that the sphere be odd-dimensional.

(a) Trivial. 

(b) $H^{n-1}(S^{n-1})$ is a one-dimensional vector space ($R^1$), so any
equivalence class is a multiple of any other. Since $\tilde\omega$ is not exact, it is in a
nonzero equivalence class. By exercise 4.25(b) it follows that every
equivalence class has a multiple of $\tilde\omega$ in it, so that for any $\tilde\alpha$ there exists
a number $a$ such that $\tilde\alpha - a\tilde\omega \sim 0$, i.e. is exact. Integrating over $S^{n-1}$
gives the value of $a$.

(ο) Ιᾶ-- αῶ = ἆβ, then Bis a (η — 2)-form. Let its dual with respect 
to @ be V, V =*, or B = (— 1)” *V. Then dB = (-- 1)” (divs V) &. 
Let f equal the dual of &, so that we get (f—a) @ = (— 1)” (divs V)&. 
This is equivalent to what was to have been proved. 

(d) In this case $\tilde\omega = x\,\tilde dy - y\,\tilde dx = \tilde d\theta$ on the unit circle, where $\theta$ is the
polar angle. Any other one-form $\tilde\alpha$ can be written as $g(\theta)\,\tilde d\theta$, so we wish
to find $f(\theta)$ such that $\tilde df = [g(\theta) - a]\,\tilde d\theta$ everywhere. Since
$\tilde df = (df/d\theta)\,\tilde d\theta$, $f$ solves $df/d\theta = g(\theta) - a$ or $f = \int g\,d\theta - a\theta$. To make $f$
continuous we require $f(0) = f(2\pi)$, or $2\pi a = \int_0^{2\pi} g\,d\theta$. This is the same
as we deduced in (b) above. By reversing the reasoning in (b), we can
conclude from this that $H^1(S^1) = R^1$.
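The construction in (d) can be followed numerically for a sample one-form on the circle. The sketch below makes its assumptions explicit: the function `g` is an arbitrary periodic example, and the integrals are approximated by a simple midpoint rule.

```python
import math

def integrate(func, lo, hi, steps=2000):
    # Midpoint rule; adequate for the smooth integrand used here
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width

def g(theta):
    # Hypothetical component of the one-form g(theta) d(theta)
    return 2.0 + math.cos(theta) + math.sin(2 * theta)

two_pi = 2 * math.pi
a = integrate(g, 0.0, two_pi) / two_pi     # from 2*pi*a = integral of g

def f(theta):
    # f = integral of g minus a*theta, with f(0) = 0
    return integrate(g, 0.0, theta) - a * theta

mismatch = f(two_pi) - f(0.0)
```

For this choice of `g` the oscillating terms integrate to zero over a full period, so `a` should come out close to 2, and `f` returns to its starting value after one circuit.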

(a) We construct $f$ as in the solution to exercise 4.24(c) above.

(b) Suppose $M$ is simply connected, and let $\tilde\alpha$ be any closed one-form
field sufficiently smooth on $M$. Then $\oint_{\mathscr{C}} \tilde\alpha$ changes smoothly as the
closed curve $\mathscr{C}$ is contracted. But $\mathscr{C}$ can always be made small enough
to be entirely within a region in which the Poincaré lemma (§4.19)
applies, in which $\oint_{\mathscr{C}} \tilde\alpha = 0$. By continuity (i.e. by joining together a
number of such small curves into a large one) it follows that $\oint_{\mathscr{C}} \tilde\alpha = 0$
for any closed $\mathscr{C}$. Then by (a) above, $H^1(M) = 0$. The converse is
similar. If $H^1(M) \neq 0$ there is a closed one-form $\tilde\alpha$ which is not exact.
By (a) above it follows that $\oint_{\mathscr{C}} \tilde\alpha \neq 0$ for at least some closed curve $\mathscr{C}$ in
$M$. If this curve could be smoothly deformed to a point then we would
have $\oint_{\mathscr{C}'} \tilde\alpha = 0$ for all sufficiently small contractions $\mathscr{C}'$ of $\mathscr{C}$. This
would, as above, imply $\oint_{\mathscr{C}} \tilde\alpha = 0$, a contradiction. So $\mathscr{C}$ cannot be
shrunk to a point.

All and only linear combinations of $\{\bar e_1, \ldots, \bar e_m\}$ are annihilated by
every one-form $\{\tilde\omega^{m+1}, \ldots, \tilde\omega^n\}$, so the complete ideal of these
one-forms is the same as that of the original forms. Let $\tilde\beta$ be a $q$-form
which annihilates every vector in $X_P$, and expand $\tilde\beta$ on the basis one-
forms. Each term in $\tilde\beta$ is the wedge-product of $q$ basis one-forms, and
each term must contain at least one of $\{\tilde\omega^{m+1}, \ldots, \tilde\omega^n\}$. If not, that
term would not annihilate every vector in $X_P$. This expansion of $\tilde\beta$ can
be written as given in the exercise.

As in exercise 4.29, expand ¥ on 8 one-form basis {@,,..., &m, 
gmt... , &"}. If 7 is in the ideal, each term has at least one of 
{@1,..., &,}and therefore satisfies (4.90). Conversely, if ¥ is a g-form 
(q <n—m) and satisfies (4.90), construct a set of vectors {x, 
J2.-++-sVm+q}in which x is in the annihilator Xp of the Gs, and the 
y;s are not. Let the (ση + q)-form in (4.90) operate on this set. The 
only nontrivial terms occur when x is an argument of ¥. If (4.90) is to 
vanish even with these terms, for arbitrary y;s, it must be that 7) 

= 0 for any x in Xp. This means ¥ is in the complete ideal. The remain- 
ing case, g +m >n, renders (4.90) an identity, but then ¥ when 
expanded upon the above basis necessarily involves at least one ι in 
each term, and so is in the ideal. 

(a) Let β be in the complete ideal of the set {@;}. Then since 8 

= © ¥/n&; for some set {71}, we have dB = Σ (dyn G& + Yin da). 
The first term is in the ideal; the second is also, since da; can be written 
as & fi a Gp. 

(b) Use (4.90) and the fact that a p-form for p > n vanishes identically. 
(b) Any curve satisfying U = const, V = const has a tangent vector 
which annihilates dU and d V, hence & and β, and therefore lies in &. 
(ο) Use the test (4.90) to determine under what conditions d& and df 
are in the ideal. This involves considerable algebra, which is helped by 
the hint given in the problem. This makes, for instance, Badd -ο. 
The result is that we can write d&a &@a B= dfadga [—(dC + AAC 
+BaF)+f(d4+Bak)+g(dB+AaBt+BaD)). This must 
vanish everywhere on the manifold. The term in square brackets is 
proportional to dx A dy, which is independent of df a dg, so it must 
itself vanish. Since A, B, etc. are independent of f and g, this term 
vanishes if and only if the three terms in parentheses vanish separately. 
This gives the first three desired conditions. The remainder follow from 
dga&n B= 0. 

(d) Factor dx a dy out of each term. 

dy =w (dx a & + xd@) + dy a B+ ydB = — 0.” ydx a dt — w*xdy a dt 
+ w?xdy a dt + wydx a dt = 0. Equation (4.97) is equally easy to 
show. 
Consider the dot product (VYim) *(*dYim) = San (e*° Yim,c)

X (w?B Yim,p) = ων “Yim.clim.p which vanishes by the antisymmetry 
of Ωω“, Since the metric is positive-definite, these vectors cannot be 
parallel unless they vanish, which happens only at isolated points if 
130. 

Take out dS λ d7 and multiply by P?. 

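(a) $\pounds_{\bar U}\tilde\omega = 0 = \tilde d[\tilde\omega(\bar U)]$ since $\tilde d\tilde\omega = 0$. Since phase space satisfies the
conditions of Poincaré's lemma (§4.19), there exists a function $H$ such
that $\tilde\omega(\bar U) = \tilde dH$, which leads to (5.16).

(b) Use $[\pounds_{\bar U}, \pounds_{\bar V}] = \pounds_{[\bar U, \bar V]}$ to show that if $\bar U$ and $\bar V$ are Hamiltonian,
so is $[\bar U, \bar V]$. This is the bracket operation for their Lie algebra.
Antisymmetry of $\tilde\omega$.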

Trivial. Use components. 

As in exercise 5.2(a) above, @(U) = dH. 

(b) Just algebra. 

(ο) By (5.32), {f, hy} =X,X,(n); , A= — fe, (AP 
=—XyX/(h); th, 8h = XE, 04). . 

(0) £76 = 0 because £50 =0. But ἐσσ5 (divg 0). 

Clearly £<,H = 0, and since X; has no momentum components 

af/ax' = 0 and 0f/aP; = — U'. Thus f = — U'P;. 

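Use the transformation

$$(\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$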


(a) A three-form in a four-space has $C^4_3 = 4$ independent components.
(b) For example, $F_{[xy,z]} = 0 \Rightarrow F_{xy,z} + F_{zx,y} + F_{yz,x} = 0 = B_{z,z}
+ B_{y,y} + B_{x,x}$. This is (5.52c).
Matrix multiplication. 
For example, PF” ,=F'* + F! , + FY’ =F, tByy + Boz. 
This gives (5.52d). 
(a) ων. = 3 (Wyztx lh” + ὠσγικ ή) =f = By. 
(“Fey = (ων + Wetxyl**) = FY δρ. 
The whole matrix is the same as (5.53) with B; > £;, E; > — B;. 
(b) Obvious from exercise 4.23. 
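(a) Prove by showing the first equation gives $\nabla \cdot B = 4\pi\rho_m$, etc.
(c) $\tilde d\tilde d\,{}^*F = 0 \Rightarrow 0 = \tilde d\,{}^*J = \tilde d[\tilde\omega(\bar J)] = (\mathrm{div}_\omega \bar J)\,\tilde\omega$.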
(a) Easily proved using components. 
(b) Note that we had to restrict to # in order to integrate, since *Tisa 
three-form. 
(c) The restriction to # of *T is Jtédx a dy a dz. The restriction of F
to 9-52 (a surface of const { and r) is *Fogd0 a ἀφ. Now, “Foo = 
S(wpragh + ωὠγιοφί') = 9° sin 6 E,. (Recall equation (4.40) and 
exercise 4.21.) Therefore the integrals become, in conventional nota- 
tion, fpd°x = f E,r* sin 6 dé ἀφ. 

(a) For example, consider the (¢, x) component of (5.64): Fy, = Ax. 4 
— A;,,. Compare this with the usual definition, F’, = 6 , + Ax. κ. 
Since δι = — E,,, the identifications follow. The remaining equations 
are consistent. 

(b)@> otf A’ >A'—Vf 

(c) Static charge $q$ at the origin: all components of $B$ vanish and
$E = q r^{-2}\,\bar e_r$. Only $({}^*F)_{\theta\phi}$ does not vanish, equalling (as in exercise 5.15)
$q\sin\theta$. This gives $a_{\phi,\theta} - a_{\theta,\phi} = q\sin\theta$. Two possible solutions are
$\{a_\phi = -q\cos\theta,\ a_\theta = a_r = a_t = 0\}$ and $\{a_\theta = -q\phi\sin\theta,\ a_\phi = a_r
= a_t = 0\}$. These differ by a gauge transformation, and both render all
other components of $\tilde d\tilde a$ zero. But neither defines a well-behaved one-
form. The first is undefined at the poles $\theta = 0$ and $\theta = \pi$; the second
is multiple-valued.
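That both candidate potentials give the same $\theta\phi$ component can be checked with finite differences. The sketch below is only illustrative: the value of `q`, the sample point and the step size are arbitrary, and the function names are hypothetical.

```python
import math

q = 1.5   # assumed charge value, for illustration only

def potential_1(theta, phi):
    # (a_theta, a_phi) for the first solution
    return (0.0, -q * math.cos(theta))

def potential_2(theta, phi):
    # (a_theta, a_phi) for the second solution
    return (-q * phi * math.sin(theta), 0.0)

def theta_phi_component(a, theta, phi, h=1e-5):
    # a_{phi,theta} - a_{theta,phi} by central differences
    d_theta_a_phi = (a(theta + h, phi)[1] - a(theta - h, phi)[1]) / (2 * h)
    d_phi_a_theta = (a(theta, phi + h)[0] - a(theta, phi - h)[0]) / (2 * h)
    return d_theta_a_phi - d_phi_a_theta

theta0, phi0 = 0.9, 2.1
field_1 = theta_phi_component(potential_1, theta0, phi0)
field_2 = theta_phi_component(potential_2, theta0, phi0)
target = q * math.sin(theta0)
```

Both estimates should agree with $q\sin\theta$ at the sample point. A short calculation suggests the two potentials differ by the gradient of $-q\phi\cos\theta$, which is itself multiple-valued in $\phi$.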

Trivial. 

(£4 Wy = [U, W]'=U'wi - Wu? ,; = UW! + USWE WU, 
where sums on α run over only (x, y,z). Since U' = 1 and U* 4 = 0 the 
result follows. 

The results follow from At, = or’ /ox = 0, Al, = dt/dx' = 0. 

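$\tilde dp = (\partial p/\partial\rho)\,\tilde d\rho + (\partial p/\partial S)\,\tilde dS$. This causes (5.77) to vanish because
$\tilde dS \wedge \tilde dS = 0 = \tilde d\rho \wedge \tilde d\rho$.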

Take the dual of (5.80), as in (5.82), and work out the components. 
Use (3.37) in Cartesian coordinates to show that (5.85) and (5.86) 
reduce appropriately. Since these are tensor equations, their validity in 
one coordinate system assures their validity in all. 

The isometry group $SO(3)$ clearly has elements which move any point
of the sphere into any other: join them by a great circle and rotate
about an axis perpendicular to that circle. So $S^2$ is homogeneous. The
isotropy group of a point $P$ is the set of all rotations which leave $P$
fixed. This is obviously the subgroup $SO(2)$ of $SO(3)$, so $S^2$ is isotropic.
(a) Since $V^i$ must vanish at $P$ (the isotropy group leaves $P$ fixed), its
Taylor expansion is (5.96), for some matrix $A^i{}_j$. From (5.89) with
$\Gamma^i{}_{jk} = 0$ in our coordinates (in which there is also no distinction
between raised and lowered indices near $P$), we conclude that
$A^i{}_j + A^j{}_i = 0$, which is (5.97).

(b) (5.98) is simple algebra.
(c) The isotropy group has the same Lie algebra as $SO(m)$, so the
groups are identical at least in some neighborhood of the identity
element. But a small neighborhood of $P$ itself can be mapped 1-1 onto
a neighborhood of the origin of $R^m$, and by (a) above the Killing fields
can be mapped into one another to $O(x^2)$. Therefore their isotropy
transformations can be put in 1-1 correspondence and the groups are
identical.

(d) If $g_{ij}$ is not positive-definite, raising and lowering indices can involve
sign changes, in our coordinates. Then (5.97) is properly $A_{ij} = -A_{ji}$,
but $A^i{}_j \neq -A^j{}_i$. So the Lie algebra of the isotropy group does not
involve antisymmetric matrices.

5.25 (a) The line ϐ = const, d = const is geometrically defined as an integral 
curve of Π. If the radial distance between spheres differed in different 
directions, the manifold would not be isotropic. Therefore g,.,. is inde- 
pendent of 6 and @. 

(b) Near P we can construct the coordinates of exercise 2.14 and trans- 
form to spherical polar coordinates via the standard flat-space transfor- 
mation. These new coordinates are identical (as r > 0) to those of 

(5.100), because the area of the spheres fixes r. Thus (5.102) is forced. 

5.26 Algebra. 

5.27 These correspond to ζ1ηι = const. It is easy to see that the norm of 
such a vector goes to zero as r > 0, so it is in the isotropy group’s 
algebra. The isotropy group SO(3) is three-dimensional, and so is the 
set of all vectors generated by the three constants ¢,,,, so these vectors 
are the entire isotropy group. 

5.28 Convert to Cartesian coordinates, or compute the norm of a vector 
with V,, = 1 and ¢,,, = 0. 

5.29 Clearly f= 1=>S is E?. 0/0x = cos ¢ sin 6 9/9Υ +r! cos ¢ cos 6 9/90 
—F sin @ cos 6 0/0¢ is the vector generated by Vy = V_, = (21/3), 
Sim = 0. 

5.30 From the second and third diagonal components of the matrices in
(5.119) we conclude $r = \sin\chi/\sqrt{K}$ and $r = \sinh\chi/\sqrt{|K|}$ respectively.
One needs to verify the first diagonal component. For example, in the
case $K > 0$, $g_{\chi\chi} = g_{rr}(\partial r/\partial\chi)^2 = (1 - Kr^2)^{-1}\cos^2\chi/K = 1/K$, as
required.

5.31 $w = r\cos\chi$, $x = r\sin\chi\sin\theta\sin\phi$, $y = r\sin\chi\sin\theta\cos\phi$,
$z = r\sin\chi\cos\theta$. Then for example $g_{\theta\theta} = (\partial w/\partial\theta)^2 + (\partial x/\partial\theta)^2
+ (\partial y/\partial\theta)^2 + (\partial z/\partial\theta)^2 = r^2\sin^2\chi$. So with $r^2 = K^{-1}$ the identification
is complete.

5.32 (a) The key point is that for K < 0, area/$4\pi$(radial distance)$^2$
$= \sinh^2\chi/\chi^2 > 1$. Consider a sphere in a submanifold of $E^n$. Because
of $E^n$’s positive metric, the distance from the centre of the sphere to
the sphere along a curve in any submanifold of $E^n$ is always larger than
the ‘true’ radius of the sphere, so that for any spherically symmetric
submanifold of $E^n$, area/$4\pi$(radial distance)$^2$ < 1. Therefore the open
universe cannot be ‘embedded’ in any Euclidean space in a manner
that preserves its metric. This is true even though the open universe has
a positive-definite metric!

(b) Consider the hyperboloid $-t^2 + x^2 + y^2 + z^2 = K^{-1}$ (< 0) in
Minkowski space. Define the usual spherical coordinates by $x = r\sin\theta\cos\phi$,
$y = r\sin\theta\sin\phi$, $z = r\cos\theta$, and then the submanifold is
$t = (r^2 - K^{-1})^{1/2}$. The metric tensor has components
$g_{rr} = -(\partial t/\partial r)^2 + (\partial x/\partial r)^2 + (\partial y/\partial r)^2 + (\partial z/\partial r)^2 = (1 - Kr^2)^{-1}$.
Similarly $g_{\theta\theta} = r^2$, $g_{\phi\phi} = r^2\sin^2\theta$. So this hyperboloid is isometric
to the open universe.
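
A quick numerical check of the radial component on the hyperboloid is sketched below. It is an added illustration under stated assumptions: K < 0, Minkowski signature (-, +, +, +), and the fact that at fixed angles the spatial terms sum to 1; the function name is hypothetical.

```python
import math

def g_rr_on_hyperboloid(K, r, h=1e-6):
    """Induced g_rr on t = sqrt(r**2 - 1/K), K < 0, in Minkowski space."""
    t = lambda rr: math.sqrt(rr * rr - 1.0 / K)
    dt_dr = (t(r + h) - t(r - h)) / (2 * h)  # central difference
    # (dx/dr)^2 + (dy/dr)^2 + (dz/dr)^2 = 1 at fixed theta, phi
    return -dt_dr ** 2 + 1.0
```

The result should agree with $(1 - Kr^2)^{-1}$ to within the finite-difference error.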

6.1 Just apply the definition of a tensor.

6.2 Algebra from (6.6), using $\bar e_{i'} = \Lambda^j{}_{i'}\,\bar e_j$.

6.3 Trivial.

6.4 In figure 6.1 the basis vector $\bar e_\theta$ is unchanged by transport in the
$\theta$-direction: $\nabla_{\bar e_\theta}\bar e_\theta = 0$. So from (6.6) we conclude
$\Gamma^\theta{}_{\theta\theta} = \Gamma^\phi{}_{\theta\theta} = 0$.

In figure 6.2 imagine that at D the vector is $\bar e_\phi$. Then its direction does
not change as it is transported in the $\theta$-direction, but its scale does,
since $\bar e_\phi \to 0$ at the poles but the transported vector does not. In fact,
$|\bar e_\phi| = \sin\theta$, so we conclude
$\bar e_\phi(\theta + \delta\theta) - \bar e_\phi(\theta) = \bar e_\phi(\theta)\,[\delta(\sin\theta)/\sin\theta]$,
or $\nabla_{\bar e_\theta}\bar e_\phi = \cot\theta\,\bar e_\phi$, or $\Gamma^\phi{}_{\phi\theta} = \cot\theta$,
$\Gamma^\theta{}_{\phi\theta} = 0$. The corresponding derivatives in the $\bar e_\phi$ direction are
harder to calculate because the curves $\theta$ = const are not great circles.
Consider a point P at $\phi = 0$, $\theta = \theta_0$. In figure A.1 we have drawn a
neighborhood of P the way it looks to someone standing there. The curve
$\theta = \theta_0$ through P is not a straight line locally, but is tangent to a
great circle at P which does look straight. Consider the point Q at
$\phi = \delta\phi \ll 1$ on the circle $\theta = \theta_0$. To calculate, say,
$\nabla_{\bar e_\phi}\bar e_\phi$ at P, we need the difference $\bar e_\phi(Q) - \bar e_\phi(P)$,
to first order in $\delta\phi$. Our coordinates are not Cartesian, so we cannot
simply subtract components of vectors at different points. Instead, we
construct a vector field $\bar V$ on the great circle by parallel-transporting
$\bar e_\phi(P)$ along it, i.e. by keeping it tangent and of the same length. The
point R with coordinate $\phi = \delta\phi$ is very near Q, their separation being
$O(\delta\phi^2)$. Therefore to first order we can use $\bar V(R)$ as a reference and
approximate $\bar e_\phi(Q) - \bar e_\phi(P) \approx \bar e_\phi(Q) - \bar V(R)$, which we can calculate
simply by subtracting components. Now, in our coordinates $\bar e_\phi$ has
components (0, 1) everywhere. So we must construct $\bar V$. The great
circle is the intersection of the sphere $x^2 + y^2 + z^2 = 1$ with the
plane $x = z\tan\theta_0$. In spherical coordinates this gives the equation of
the great circle to be $\sin\theta = \sin\theta_0\,(1 - \cos^2\theta_0\sin^2\phi)^{-1/2}$. Using $\phi$
as a parameter on it gives its tangent vector
$(d\theta/d\phi, 1) = (\sin\theta_0\cos\theta_0\sin\phi/(1 - \cos^2\theta_0\sin^2\phi), 1)$.
At $\phi = 0$ (point P) this equals $\bar e_\phi(P)$, and at $\phi = \delta\phi$ (point R) it is
$(\sin\theta_0\cos\theta_0\,\delta\phi, 1)$ to first order. Again to first order, this has
the same length as $\bar e_\phi(P)$, so this is in fact $\bar V(R)$. We therefore get
$\nabla_{\bar e_\phi}\bar e_\phi = \lim_{\delta\phi\to 0}(\bar e_\phi(Q) - \bar V(R))/\delta\phi = (-\sin\theta_0\cos\theta_0, 0)$.
It follows that $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi{}_{\phi\phi} = 0$. A similar
calculation for $\nabla_{\bar e_\phi}\bar e_\theta$ gives $\Gamma^\phi{}_{\theta\phi} = \cot\theta$, $\Gamma^\theta{}_{\theta\phi} = 0$.
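
The Christoffel symbols found geometrically above can be cross-checked from the metric. The sketch below is an added illustration, not the book's method: it assumes the unit-sphere metric $g_{\theta\theta} = 1$, $g_{\phi\phi} = \sin^2\theta$ and the usual metric-connection formula $\Gamma^i{}_{jk} = \tfrac12 g^{il}(g_{lj,k} + g_{lk,j} - g_{jk,l})$, with derivatives approximated by central differences. The function names are hypothetical.

```python
import math

def sphere_metric(theta, phi):
    """Components g_ij of the unit two-sphere in (theta, phi) coordinates."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def christoffel(x, metric=sphere_metric, h=1e-5):
    """Return gamma[i][j][k] ~ Gamma^i_{jk} at the point x = [theta, phi].

    Assumes a diagonal 2x2 metric, so the inverse is just 1/g_ii.
    """
    g = metric(*x)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]

    # dg[k][i][j] approximates the partial derivative of g_ij along x^k.
    dg = []
    for k in range(2):
        up, down = list(x), list(x)
        up[k] += h
        down[k] -= h
        g_up, g_down = metric(*up), metric(*down)
        dg.append([[(g_up[i][j] - g_down[i][j]) / (2 * h) for j in range(2)]
                   for i in range(2)])

    return [[[0.5 * sum(g_inv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][j][k])
                        for l in range(2))
              for k in range(2)]
             for j in range(2)]
            for i in range(2)]
```

With index 0 for $\theta$ and 1 for $\phi$, `christoffel([theta, phi])[0][1][1]` should approximate $-\sin\theta\cos\theta$ and `[1][0][1]` should approximate $\cot\theta$.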

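6.5 $\langle\tilde\omega^j, \bar e_k\rangle = \delta^j{}_k$, so $\langle\nabla_i\tilde\omega^j, \bar e_k\rangle = -\langle\tilde\omega^j, \nabla_i\bar e_k\rangle = -\Gamma^j{}_{ki}$.
Therefore $\nabla_i\tilde\omega^j$ is a one-form whose kth component is $-\Gamma^j{}_{ki}$, as in (6.8).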

6.6 Use exercise 6.5 and follow the steps leading to equation (6.10).

6.7 As above.

6.8 In a coordinate basis, $[\bar e_i, \bar e_j] = 0$.

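6.9 Need to show it is linear in its arguments. For example,
$T(\ ; f\bar U, \bar V) = \nabla_{f\bar U}\bar V - \nabla_{\bar V}(f\bar U) - [f\bar U, \bar V]
= f(\nabla_{\bar U}\bar V - \nabla_{\bar V}\bar U - [\bar U, \bar V]) - \bar V(f)\,\bar U + \bar V(f)\,\bar U$.
The terms involving derivatives of f cancel, so T is indeed
linear on that argument.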

6.10 Similar to the proof of exercise 6.9.

6.11 One needs only to show it for scalars (where they are both the same)
and vectors (which is the content of (6.13)). Since both derivative
rules generalize to tensors of higher rank via (6.3), the result follows
for all tensors.

6.12 Obvious.

6.13 Algebra.

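6.14 (a) In a coordinate basis $[\bar e_i, \bar e_j] = 0$.
$\nabla_i\nabla_j\bar e_k = \nabla_i(\Gamma^l{}_{kj}\bar e_l) = \Gamma^l{}_{kj,i}\bar e_l + \Gamma^l{}_{kj}\nabla_i\bar e_l
= \Gamma^l{}_{kj,i}\bar e_l + \Gamma^l{}_{kj}\Gamma^m{}_{li}\bar e_m$. Antisymmetrizing on i and j and
relabelling some indices gives the result.

(b) Obvious.

(c) (6.23a) is obvious from (6.19), but (6.23b) must be proved. In
normal coordinates at P, $\Gamma^i{}_{jk}(P) = 0$ so that $R^l{}_{kij}(P) = \Gamma^l{}_{kj,i} - \Gamma^l{}_{ki,j}$.
Then $3R^l{}_{[kij]} = \Gamma^l{}_{kj,i} - \Gamma^l{}_{ki,j} + \Gamma^l{}_{ik,j} - \Gamma^l{}_{ij,k} + \Gamma^l{}_{ji,k} - \Gamma^l{}_{jk,i} = 0$
because $\Gamma^l{}_{jk} = \Gamma^l{}_{kj}$.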

(d) The four indices mean that we begin with $n^4$ components. Equations
(6.23a) are $n^2 \times \tfrac12 n(n + 1)$ separate relations, since l and k are free,
while there are $\tfrac12 n(n + 1)$ symmetric pairs (ij). (This is the same as the
number of independent components of a symmetric $n \times n$ matrix.)
Constraint (6.23b) is entirely independent of (6.23a) since it involves
only $R^l{}_{k[ij]}$. There are $n(n - 1)(n - 2)/3!$ different antisymmetric
triplets (kij) in this equation; with the free choice of l this gives the
third term in (6.24).
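
The counting argument can be tabulated for small n. The sketch below is an added illustration with hypothetical function names. The first function follows the three terms described in this solution; the second follows the pair-counting argument of solution 6.22(b), whose totals $n(n-1)(n^2-n+2)/8$ and $n(n-1)(n-2)(n-3)/4!$ are quoted there. The closed forms $n^2(n^2-1)/3$ and $n^2(n^2-1)/12$ used as a cross-check are standard results, not quoted from the text.

```python
from math import comb

def riemann_components_no_metric(n):
    """n^4 components, minus n^2 * n(n+1)/2 relations from symmetry in (ij),
    minus n * C(n, 3) relations from the cyclic identity."""
    return n**4 - n**2 * (n * (n + 1) // 2) - n * comb(n, 3)

def riemann_components_with_metric(n):
    """Symmetric matrix on antisymmetric index pairs, minus one cyclic
    constraint per set of four distinct indices."""
    pairs = n * (n - 1) // 2
    symmetric_matrix = pairs * (pairs + 1) // 2
    return symmetric_matrix - comb(n, 4)

for n in range(2, 6):
    print(n, riemann_components_no_metric(n), riemann_components_with_metric(n))
```

For n = 4 the two functions give 80 and 20 respectively.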

6.15 In normal coordinates at P, $R^l{}_{kij;m} = R^l{}_{kij,m} = \Gamma^l{}_{kj,im} - \Gamma^l{}_{ki,jm}$:
the first term is symmetric in (im), the second in (jm), so both vanish
in (6.25). The Jacobi structure follows from (6.19).

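6.16 (a) $\bar e_r = \cos\theta\,\bar e_x + \sin\theta\,\bar e_y$, $\bar e_\theta = -r\sin\theta\,\bar e_x + r\cos\theta\,\bar e_y$
$\Rightarrow \nabla_{\bar e_\theta}\bar e_r = -\sin\theta\,\bar e_x + \cos\theta\,\bar e_y = r^{-1}\bar e_\theta \Rightarrow \Gamma^\theta{}_{r\theta} = r^{-1}$,
$\Gamma^r{}_{r\theta} = 0$. Others are derived the same way.

(b) $V^r{}_{;r} = V^r{}_{,r}$. $V^r{}_{;\theta} = V^r{}_{,\theta} - rV^\theta$. $V^\theta{}_{;r} = V^\theta{}_{,r} + V^\theta/r$.
$V^\theta{}_{;\theta} = V^\theta{}_{,\theta} + V^r/r$. $V^i{}_{;i} = V^r{}_{,r} + V^\theta{}_{,\theta} + V^r/r$.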

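6.17 (a) $(\pounds_{\bar V}\tilde\omega)_{i\ldots k} = \omega_{i\ldots k,l}V^l + \omega_{i\ldots k}V^l{}_{,l} = \omega_{i\ldots k;l}V^l + \omega_{i\ldots k}V^l{}_{;l}$. Then
$V^i{}_{;i} = \mathrm{div}_{\tilde\omega}\bar V$ for all $\bar V$ if and only if $\nabla\tilde\omega = 0$.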

(0) 5st = @i.1 mi. = PF ΓΕ mi) Gi. 

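6.18 (a) For vectors $\bar A$, $\bar B$ parallel-transported along $\bar V$,
$\nabla_{\bar V}[g(\bar A, \bar B)] = (\nabla_{\bar V}\,g)(\bar A, \bar B) + g(\nabla_{\bar V}\bar A, \bar B) + g(\bar A, \nabla_{\bar V}\bar B)
= (\nabla_{\bar V}\,g)(\bar A, \bar B)$. This vanishes for all $\bar A$, $\bar B$, $\bar V$ if and only if $\nabla g = 0$.

(b) Equation (6.29) implies $g_{ij,k} = \Gamma^l{}_{ik}\,g_{lj} + \Gamma^l{}_{jk}\,g_{il}$. Adding the
various g's together as on the right-hand side of (6.30) gives the result.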

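6.19 We need to show that (6.30) implies $\Gamma^i{}_{ji} = (\ln|g|^{1/2})_{,j}$. This is easy
once we have proved that $g_{,p} = g\,g^{ij}g_{ij,p}$. (This is true for the
determinant of any matrix.) We begin with
$g = \epsilon^{ij\ldots k}g_{1i}g_{2j}\cdots g_{nk} = g_{1i}(\epsilon^{ij\ldots k}g_{2j}\cdots g_{nk})$.
From this we can see that if we define
$g^{1i} = g^{-1}\epsilon^{ij\ldots k}g_{2j}\cdots g_{nk}$ then we have $g^{1i}g_{1i} = 1$. Moreover,
$g^{1i}g_{2i} = 0$ by the antisymmetry of $\epsilon$. We therefore have an explicit
expression for the inverse of $g_{ij}$:
$g^{il} = [(n-1)!\,g]^{-1}\epsilon^{ij\ldots m}\epsilon^{lk\ldots r}g_{jk}\cdots g_{mr}$. Now we
use exercise 4.12(b), $g = \epsilon^{ij\ldots m}\epsilon^{lk\ldots r}g_{il}g_{jk}\cdots g_{mr}/n!$, to deduce
$g_{,p} = n\,\epsilon^{ij\ldots m}\epsilon^{lk\ldots r}g_{il,p}\,g_{jk}\cdots g_{mr}/n! = g\,g^{il}g_{il,p}$. The rest is easy.
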


6.20 Replace commas by semicolons in equation (3.37) and use $g_{ij;k} = 0$.
6.21 Algebra. 
6.22 (b) One cannot simply subtract from equation (6.24) a new number
representing the number of constraints in (6.33), because these new
constraints may not all be independent of the previous ones. Instead,
we begin anew and concentrate on pairs of indices. By (6.23a) there
are $\tfrac12 n(n - 1)$ independent pairs, so (6.33) implies R is a symmetric
matrix in a space of $\tfrac12 n(n - 1)$ dimensions, i.e. has
$\tfrac12[n(n - 1)/2] \times [n(n - 1)/2 + 1] = n(n - 1)(n^2 - n + 2)/8$ independent components.
Now, (6.23b) represents fewer independent constraints than before.
Given all triples (kij) ($n(n - 1)(n - 2)/3!$ possible sets), does every
choice of l give an independent constraint? No, because (6.23a) and
(6.33) enable us to manipulate
$3R_{l[kij]} = R_{lkij} + R_{lijk} + R_{ljki} = R_{kjil} + R_{kilj} + R_{klji} = 3R_{k[jil]}$.
This means that we have new information only for every set of four
indices, all different: $n(n - 1)(n - 2)(n - 3)/4!$ constraints in all.
The result is as given in the problem.

6.23 (a) From (6.33) and (6.23a).

6.24 A geodesic is defined by equation (6.16a). From this it is easy to see
that $\nabla_{\bar U}\,g(\bar U, \bar U) = 0$, so that if $g(\bar U, \bar U) \neq 0$ small variations in the
path will not change the sign of $g(dx/d\lambda, dx/d\lambda)$. We will do the
case of a space-like geodesic first. By the calculus of variations,
$\delta\int (g_{ij}\dot x^i\dot x^j)^{1/2}\,d\lambda$ when the path is changed by $\delta x^i(\lambda)$ is, to first order,
$\tfrac12\int \delta x^i(\lambda)\,[-2\,d(g_{ij}\dot x^j)/d\lambda + g_{jk,i}\dot x^j\dot x^k]\,(g_{lm}\dot x^l\dot x^m)^{-1/2}\,d\lambda
= -\int \delta x_i(\lambda)\,(\ddot x^i + \Gamma^i{}_{jk}\dot x^j\dot x^k)\,(g_{lm}\dot x^l\dot x^m)^{-1/2}\,d\lambda$. (Here dots stand for
$d/d\lambda$.) This implies a geodesic is an extremum. The proof for time-like
geodesics is nearly identical. The case of a null geodesic is handled by
breaking the integral into segments, in each of which the variation is
time-like or space-like or null.

6.25 (a) $D_\mu\psi = \nabla_\mu\psi - iA_\mu\psi$.
$D_\mu(e^{i\phi}\psi) = \nabla_\mu(e^{i\phi}\psi) - i(A_\mu + \nabla_\mu\phi)(e^{i\phi}\psi)
= e^{i\phi}(\nabla_\mu\psi - iA_\mu\psi) = e^{i\phi}D_\mu\psi$.

(b) $D_\mu D_\nu\psi = \nabla_\mu\nabla_\nu\psi - iA_\nu\nabla_\mu\psi - i(\nabla_\mu A_\nu)\psi - iA_\mu\nabla_\nu\psi - A_\mu A_\nu\psi$.
Since $\nabla_\mu\nabla_\nu\psi = \nabla_\nu\nabla_\mu\psi$ we have $[D_\mu, D_\nu]\psi = -i(\nabla_\mu A_\nu - \nabla_\nu A_\mu)\psi$.
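
The gauge-covariance statement of part (a) can be spot-checked numerically in one dimension. This is a minimal sketch added for illustration, assuming the transformation $\psi \to e^{i\phi}\psi$, $A \to A + d\phi/dx$ and the derivative $D\psi = d\psi/dx - iA\psi$; the test functions and all names are hypothetical choices.

```python
import cmath
import math

def deriv(f, x, h=1e-6):
    """Central-difference derivative of a real- or complex-valued function."""
    return (f(x + h) - f(x - h)) / (2 * h)

def gauge_derivative(psi, A):
    """Return the function x -> psi'(x) - i A(x) psi(x)."""
    return lambda x: deriv(psi, x) - 1j * A(x) * psi(x)

# Arbitrary smooth test functions (hypothetical choices).
psi = lambda x: cmath.exp(-x * x) * (1 + 2j * x)
A = lambda x: math.sin(x)
phi = lambda x: 0.5 * x * x + math.cos(x)

# Gauge-transformed field and potential.
psi_new = lambda x: cmath.exp(1j * phi(x)) * psi(x)
A_new = lambda x: A(x) + deriv(phi, x)

def covariance_defect(x):
    """|D'(e^{i phi} psi) - e^{i phi} D psi| at x; zero up to rounding."""
    lhs = gauge_derivative(psi_new, A_new)(x)
    rhs = cmath.exp(1j * phi(x)) * gauge_derivative(psi, A)(x)
    return abs(lhs - rhs)
```

The defect should vanish to within finite-difference error at any sample point, reflecting that the extra term from differentiating the phase is cancelled by the shift in the potential.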

Symbols are listed with the page(s) on which they are defined or extended. For
conventions on the placement of indices and the summation convention, see
§2.21 and §2.26.


Types of tensor 
a, 13 

b, 49, 116 

Ε, 57 

r, 44 

w, 129 


Special tensors, etc.
Ol, δι]. 65 
0/1, 7, 67 
R( , ), $R^i{}_{jkl}$, 211
ει ες πι 128 
δµ, 17 
ike 205 
Aj, 61 


Special spaces 
$E^n$, 15

GL(n, C), 100 
GL(n, R), 41 
L(n), 67 
O(n), 66 

R, 2 

$R^n$, 1

$S^n$, 26

SO(n), 29, 98 


SU(n), 100 
Tp, 34 
T*p, 53 
TM, 36 
T*M, 53 
U(n), 100 


Operations 

[, ], 11, 14, 102 
*T, *B, 125, 126 
x, 38 

®, 59 

{ῶ, 122 

αἲ απ. 17 

d, 53, 134 
det(A), 18 

dive, 148, 149 
exp( ), 43 
f: M → N, 6
f: x ↦ y, 6
g ∘ f, 7

Lg, 92 

£7, 76 

tr(A), 18 

νι, 135 

γη. 207 

aly, 120
ἄλ B, 118 σον, 117

V, Va. V;, 203-6 f(S), f(T), 6 

6&(U), 50, 119 (N'), 57 

(@, U), 50 O(fi,--+»>fn)/OH%1,.-+5Xn)s 9 
90, 144 

Miscellaneous 1-1, 6 

σ.σ, ὃ 


Cc’, 10 


INDEX 


accelerated observer, 71 

adiabatic, 182, 183 

adjoint transformation, 107 

affine connection, 76, 201—222, 218, 221; 
linearity, 204; metric, 216; not a tensor 
field, 205; symmetric, 207, 208, 216; 
torsion, 207 

affine parameter, 209, 218 

analytic function, 9, 10 

analytic manifold, 26 

angular momentum operators, 85; as Killing 
vector fields, 89 

annihilator, 154 

annulling a form, 120, 153, 166 

antiderivation, 134 

antisymmetric tensor, 115 

area, 113; tensor, 115 

atlas, 25 

automorphism, 107 

axial eigenvalue, 90, 91 

axial harmonics: scalar, 90; vector, 90 

axial symmetry, 87, 89; and group theory, 
91 

axial vector, 126 


basis, 14; Cartesian, 66; commutation 
coefficients, 211; coordinate, 34, 47, 56; 
dual, 55; for one-forms, 55; globally 
orthonormal, 69; handedness, 99, 115, 
121, 132; Lorentz, 66; noncoordinate, 
44, 211; orthonormal, 66, 68, 132, 184; 
transformation of, 60 

basis transformation, 60; matrix of, 61 

basis-invariance, 63 

Betti number, 152 

Bianchi identities, 212 

big bang, 199 

bijection, 7 

boundary of a region, 144 


canonical energy, 173 

canonical momentum, 173, 221 
canonical transformation, 168 
Caratheodory’s theorem, 165 


Cartan, E., 113 

chart, 24 

Christoffel symbols, 205, 218; of two- 
sphere, 206; transformation law, 205 

closed form, 138; locally exact, 138, 140; 
on a sphere, 150; sufficient condition 
for exactness, 142 

closed ideal, 163 

closed set of forms, 158 

cofactor, 18 

cohomology, 139, 142, 150; and simply 
connectedness, 152; classes of n-sphere, 
151, 152

commutator, 44; of covariant derivatives, 
210; of operators, 11; of vector fields, 44 

compatibility: of connection and differential 
structure, 204; of connection and metric, 
215, 216; of connection and volume 
form, 215, 216; of metric and volume 
form, 216 

components: see under individual types of 
tensor 

composition of two maps, 7 

configuration space, 174 

congruence, 43, 151 

connection: affine, see affine connection; 
metric, 216 

connection one-form, 220 

conservation law, 89, 149, 158, 163, 171, 
173, 182 

conservative, 168 

conserved quantities and Killing vector 
fields, 171 

continuity: function, 7, 8; group, 12; map, 7; 
one-form, 52; space, 1, 2 

continuum mechanics, 58 

contraction, 50, 56, 59; of vector with 
form, 119 

coordinate transformation, 31, 62, 255 

coordinates, 23, 24; curvilinear, 70; normal, 
210 

Copernican principle, 189 

cosmological principle, 189 


cosmology, 161, 186-199; big bang, 199;
closed, 198, 199; expanding, 199; flat, 
198, 199; homogeneous, 187, 189; 
isotropic, 187, 189; open, 198, 199; 
standard model, 199 

cotangent bundle, 53, 174, 175 

covariant derivative, 184, 203; com- 
mutator of, 210; exponentiation of, 212; 
Jacobi identity for, 212; Leibniz rule for, 
204; of a scalar field, 204; of a tensor 
field, 206; of a vector field, 203 

cross-product, 125, 126; triple, 131 

cross-section, 38, 53, 220 

curl, 136, 176 

curvature, 68, 214 

curve, 30, 153; congruence, 43, 73, 151; 
geodesic, 208; parameter, 30; spacelike/ 
timelike/null, 70; tangent vector, 32 


density, scalar, 129; tensor, 129; weight, 129 

diffeomorphism, 30, 73, 92, 188 

differentiability class: of a function, 8; of a 
one-form, 53, 56 

differential form, 68, 117; annulling, 120, 
153, 166; closed, 138, 140, 142; degree 
of, 117; exact, 138, 140, 142, 163, 170; 
field, 120; independent components of, 
117; integrable, 165; Lie derivative of, 
142; restriction of, 120, 153, 188; 
sectioning, 120 

differential forms: annihilator of, 154; 
closed ideal, 163; closed set, 158; com- 
mutation rule, 119; complete ideal, 154, 
163; differential ideal, 155; surface- 
forming, 155 

directional derivative, 33, 53 

Dirac bra and ket, 51 

Dirac delta function, 51 

direct product, 59 

directional derivation, 33, 53 

discrete set, 2 

distance function, 1, 3—5 

distance on a manifold, 68, 70 

distribution (of vector fields), 83 

distribution (over functions), 51 

divergence, 137, 176, 196; in spherical 
coordinates, 148; of a p-vector, 149; of 
a vector field, 147 

divergence theorem, 147 

domain of an operator, 11 

dual, 186, 196; double, 128, 133; metric, 
128; of a p-form, 125; of a p-vector, 125 

duality of vectors and one-forms, 50 


eigenvalue, 19, 65, 96, 99 
eigenvector, 19 
Einstein summation convention, 56, 62
electromagnetism, 175-181; as a gauge
theory, 219-222; charge, 178; charge 
and topology, 179—80; current four- 
vector, 177, 178; Faraday tensor, 176, 
221; gauge transformation, 180, 219, 
221; magnetic monopoles, 178, 180; 
one-form potential, 180, 221, 219; 
plane waves, 181; polarization, 181; 
vector potential, 180 

entropy, 163, 166, 167, 182, 183 

equation of continuity, 149 

equation of state, 163 

equivalence class, 150 

equivalence relation, 150 

Ertel’s theorem, 185 

Euclidean space, 15, 65, 79, 121, 161, 176, 
182, 197, 198, 214, 218 

Euclidean vector algebra, 68, 125, 131, 132 

Euclidean vector calculus, 70, 136, 137, 
142, 147, 148, 182 

exact form, 138, 163, 170; on a sphere, 
149, 150 

expansion, 186 

exponential map, 167, 210, 215 

exponentiation: of an operator, 43; of 
covariant derivative, 212 

exterior derivative, 134; commutes with Lie 
derivative, 143; Leibniz rule for, 134 


Faraday tensor, 176, 221 

fiber bundle, 35, 36, 40, 53, 183, 220; base 
manifold, 36; cross-section, 38, 53, 220; 
fiber, 36, 40; global properties, 38; 
globally trivial, 38; locally trivial, 38, 
40; principal, 42; projection, 37, 40; 
structure group, 40

fixed-point theorem, 39, 151, 196 

flat manifold, 172, 214 

fluid: multicomponent, 164; perfect, 181; 
single-component, 163 

fluid dynamics, 149 

foliation, 81, 108, 188; leaf of, 81 

frame bundle, 42 

Frobenius’ theorem, 81, 82, 153-155, 163,
165, 166 

function, 5, 30; analytic, 9, 10; as a tensor,
58, 64; continuous, 7, 8; differentiable, 
31 

function space, 51 

fundamental theorem of calculus, 134 


Galilean spacetime, 182, 183, 218 

gauge: curvature two-form, 221; theories, 
219; transformation, 180, 219, 221 

gauge-covariant derivative, 220 

Gauss’ theorem, 147, 148 


geodesic curve, 208; affine parameter of, 
209, 218; extremal length of, 218 

geodesic deviation, 213, 214 

geodesic equation, 208 

geodesically complete manifold, 210 

geometrical object, 62 

GL(n, C), 100 

GL(n, R), 41, 95; acting on $R^n$, 99; Lie
algebra of, 97 

gradient, 53; not naturally a vector, 54; of a 
vector field, 204; vector, 69, 71, 89 

Grassmann algebra, 118, 125 

group, 11—13; abstract, 105; homomor- 
phism, 13; isomorphism, 12; isotropy, 
188, 191, 192, 194, 197; Lie, see Lie 
group; Lorentz, 67, 192; permutation, 
12; realization, 106; representation, 106; 
rotation, 29, 98, 99; translation, 12 


Hamiltonian: equations, 167; function, 170, 
171, 173, 174; vector field, 168, 170, 
171 

handedness, see under basis 

harmonic oscillator, 153, 159 

Hausdorff property, 3 

Helmholtz circulation theorem, 185 

Hermitian conjugate, 100 

Hilbert space, 51, 108 

homeomorphism, 40 

homogeneous, 187, 188, 191

homomorphism, 102, 104 

hypersurface, 79, 167, 178, 182, 183, 188; 
see also submanifold 


ideal: closed, 163; complete, 154, 163; dif- 
ferential, 155 

identity transformation, 17 

image, 6 

index notation, 73 

index raising and lowering, 68 

indices: antisymmetrized, 166; placement of, 
56, 62 

infinitely differentiable, 9 

inner automorphism, 107 

inner product, 15, 51, 71 

integrability conditions, 138, 155 

integration, 134; and orientation, 123; 
change of variables, 9; of forms, 121; 
of functions, 121 

into (map), 7 

invariance of a tensor field, 86; see also 
Killing vector field 

inverse function theorem, 9, 35, 49 

inverse image, 6 

inverse map, 6 

inversion, 99 

isometry, 188, 190
isospin, 37
isotropic, 187, 189 


Jacobi identity: for covariant derivatives, 
212; for Lie derivatives, 78; for vector 
fields, 47; for Poisson brackets, 170 

Jacobian, 9, 122, 129, 132 

Jacobian matrix, 9, 35, 63 


Killing vector field, 88, 108, 171, 188, 192, 
195, 216; nonexistence for a general 
metric, 190 

Killing’s equation, 216 

Klein—Gordon equation, 174, 219 

Kronecker delta, 17 


Lagrangian function, 167, 174 

left-invariant vector field, 93—95 

Leibniz rule: for covariant derivative, 204; 
for exterior derivative, 134; for Lie 
derivative, 78, 79 

Levi-Civita symbol, 128; and determinants,
131; and p-delta symbol, 130; products 
of, 130 

Lie algebra, 47, 92, 93, 95, 97, 101, 155, 
170; Abelian, 93, 105; dimension of, 88; 
of invariant vector fields, 87; of isometry 
group, 190; of SO(3), 100; of SU(n), 
101; structure constants, 93; subalgebra, 
108 

Lie bracket, 45, 75, 157; closure of a set of, 
158; picture of, 46 

Lie derivative, 68, 173, 182, 184; as a partial 
derivative, 78; commutes with exterior 
derivative, 143; components of, 79; of a 
differential form, 142; of a one-form, 79; 
of a scalar, 76; of a tensor, 79; of a vector 
field, 77

Lie dragging, 73, 90, 209, 213; of a 
function, 73; of a one-form, 70, 79; of a 
region, 144; of a vector field, 75 

Lie group, 12, 29, 87, 92, 188; abelian, 105; 
adjoint representation, 189; and 
invariance, 92; component of the 
identity, 97; covering group, 102; dis- 
connected, 97; left and right translations, 
92; Lie algebra of, 93; one-parameter 
subgroup of, 94, 95; simply connected, 
102; tangent bundle, 94; transitive 
action, 188 

Lie subalgebra, 192 

linear combination, 14 

linear independence, 14 

linear transformation, 16; components of, 16 

Liouville’s theorem, 171 

Lorentz frame, 70, 71 

Lorentz group, 67, 192

Lorentz transformation, 67, 219 


manifold, 23; analytic, 26; complex, 51; dif- 
ferentiable, 23; differential structure, 
201; dimension of, 23; geodesically com- 
plete, 210; homogeneous, 188, 191; 
isotropic, 189; maximally symmetric, 
191, 193; orientable, 41, 132, 175; 
Riemannian, 169, 201—222; simply con- 
nected, 102, 152; symplectic, 171 

many-to-one, 6 

map, 5; composition, 7; continuous, 7; dual, 
125; exponential, 167, 210, 214; 
generated by a congruence, 73; into, 7; 
inverse, 6; many-to-one, 6; of p-forms to 
n-vectors, 125; one-to-one, 6; onto, 7; 
stereographic, 27 

matrix, 50; anti-Hermitian, 100; block- 
diagonal form, 96; canonical form, 96, 
97, 99; cofactors of, 18; determinant, 
18, 131; diagonal, 65; inverse, 17, 19; 
nonsingular, 17; orthogonal, 65; singular, 
17; trace, 19, 101; transpose, 17 

Maxwell identities, 164, 169 

Maxwell’s equations, 175-181 

metric, 36, 38, 51, 54, 76, 113, 169, 178, 
184, 201, 215; Euclidean, 65; 
Minkowski, 66, 71; signature, 66, 69 

metric connection, 216 

metric dual, 128, 133 

metric tensor, 64, 187, 214; as a map of 
vectors to one-forms, 67; canonical 
form, 66; indefinite, 66, 133; inverse, 
67; negative-definite, 66; positive- 
definite, 66 

metric tensor field, 68, 108; local flatness, 
69; signature, 68 

metric volume element, 132, 148, 160, 193 

Minkowski metric, 66 

Minkowski space, 70, 79, 179, 214, 217, 
218, 219, 220 

Möbius band, 39; not orientable, 121, 124;
structure group, 42 

multilinearity, 57 


n-tuple, 1, 15, 23 

n-vector, 125 

neighborhood, 1; generalized, 5 
Newtonian gravity, 187 

norm, 14; Euclidean, 15 
normal coordinates, 210 
normal one-form, 147 


O(n), 66, 98; as a disconnected group, 98; 
dimension of, 99 

one-form, 49; as a tensor, 58; basis, 55; 
components, 55; coordinate basis, 56; 
dual basis, 55; field, 52; normal, 147; 
picture of, 53

one-parameter subgroup, 94, 95; infinitesimal
generator, 96

one-to-one, 6 

onto (map), 7 

open set in $R^n$, 2, 3

operator, 10; as a tensor, 58; domain of, 11; 
extension of, 11; multiplicative, 211 

orientability, 41, 132, 175; internal, 121 

orientation: external, 123; internal, 123 

orthogonal group, see O(n) 

outer product, 59, 64 


p-delta symbol, 130; contraction of, 130 

parallel transport, 202, 204; around a loop, 
213 

parallelism, 76, 201; global, 202, 214 

parallelogram rule, 15 

partial differential equations, 134, 137, 
152; integrability conditions, 138, 155 

permutation group, 12 

phase space, 28, 168, 174; volume form, 171 

Poincaré lemma, 140 

Poisson bracket, 170 

principle of equivalence, 218 

principle of mediocrity, 189 

principle of minimal coupling, 218 

product space, 38 

proper distance and time, 70 

pseudo-norm, 15, 71 


quantum mechanics, 51; commutation 
relations, 85 
quotient space, 150 


R, 2

$R^n$, 1

realization, 106; faithful, 106; group 
adjoint, 107; of SO(3), 106; principal, 
106; progressive, 106; retrograde, 106 

relativity, 38, 188; general, 217, 218, 219; 
special, 66, 68, 70, 219 

representation, 106; abstract, 110; adjoint, 
107; irreducible, 109; of SO(3), 106, 109 

restriction of a form, 120, 153, 188 

Ricci scalar, 217 

Ricci tensor, 217 

Riemann tensor, 211, 213; number of 
independent components, 212, 217 

Riemannian geometry, 76, 184, 187 

Riemannian manifold, 169, 201—222 

right-invariant vector field, 94 

rotation group, see SO(3) 

rule of linearity, 16 


scalar, 64; covariant derivative of, 204 
sectioning a form, 120 
shear, 186
signature, 66, 68, 69

similarity transformation, 19, 65 

simply connected manifold, 102; and 
cohomology, 152 

SO(2), 92 

SO(3), 29, 98, 99, 100, 111, 160, 190, 192; 
double-valued representations of, 111; 
fundamental representation of, 111; 
global topology, 104; not simply con- 
nected, 105; realization of, 106; rep- 
resentation of, 106, 109 

SO(n), 98, 189, 192 

Sorkin’s model for charge, 180 

spherical harmonics, 108, 109; as eigen- 
functions, 110; completeness of, 109, 
110, 160; vector, 160, 161, 195 

spherical symmetry, 92, 108, 192 

spinor, 111 

square integrability, 10

Stokes’ theorem, 144, 147, 149, 150, 159

stress tensor, 58 

SU(2), 111; diffeomorphic to the three- 
sphere, 103; double covering of SO(3), 
104; global topology, 104; Lie algebra 
of, 101 

SU(n), 100, 101 

subgroup, 12 

submanifold, 79, 80, 90, 153, 157, 158, 
188; one-forms of, 81; tangent vectors 
of, 80; see also hypersurface 

symmetry: axial, 87, 89 

symmetry group, Euclidean, 66

symplectic: form, 171, 174; inner product, 
171, 172, 174; manifold, 171 


tangent bundle, 36, 37, 174; of a Lie group, 
94; structure group, 41 

tangent one-form, 53, 54 

tangent space, 34 

tangent vector, 32; space of, 33 

Taylor expansion, 9, 167 

tensor, 57; antisymmetric, 115; completely 
antisymmetric part of tensor, 116; com- 
pletely antisymmetric tensor, 115; com- 
ponents of, 59; order, 68; symmetric, 
64, 116; type of, 57 

tensor equation, 64 

tensor field, 57; components of, 59 

tensor operation, 64

tensor product, 59

thermodynamics, 163—167; Caratheodory’s 
theorem, 165; composite system, 164, 
165; entropy, 166, 167, 182, 183; first 
law, 163; second law, 163, 166 

topographical map, 53 

topological space, 3 

topology, 51; and charge, 179, 180; global,
1, 28, 139, 140, 214; induced, 3, 5; 
local, 1; of SO(3) and SU(2), 104 

torsion tensor, 207, 209 

trace, 19, 101 

transformation law: for basis one-forms, 61; 
for basis vectors, 61; for Christoffel 
symbols, 205; for one-form components, 
61; for tensor components, 62; for 
vector components, 61 

translation, 219 

translation group, 12 

transpose, 17 


U(1)-bundle, 220 

U(n), 100 

unit matrix, 17 

unitary group, 100 
universe, see cosmology 


valid tensor equation, 177 

vector, 32—34; as a tensor, 58; components, 
14, 32, 55; contravariant, 50, 62; co- 
variant, 50, 62 

vector field, 34, 42; components of, 34; co- 
variant derivative, 203; gradient of, 204; 
Hamiltonian, 168, 170, 171; integral 
curves, 42 

vector space, 13; as a manifold, 28, 172; 
complex, 16, 51; dimension, 14 

vector subspace, 14 

volume element, 113 

volume form, 121, 171, 175, 177, 182, 201,
215; inverse, 126

vorticity, 182, 184; conservation of, 185 


wedge product, 117; components of, 118; 
forms of arbitrary degree, 118 

Weyl tensor, 217 

Wheeler’s model for charge, 179 


zero-form, 117 


